INHT26 transgenic soybean

ABSTRACT

Transgenic INHT26 soybean plants comprising modifications of the DAS44406-6 soybean locus which provide for facile excision of the modified DAS44406-6 transgenic locus or portions thereof, methods of making such plants, and use of such plants to facilitate breeding are disclosed.

REFERENCE TO SEQUENCE LISTING SUBMITTED ELECTRONICALLY

The sequence listing contained in the file named “P13422US03.XML,” created on Jan. 30, 2023 and electronically filed on Jan. 31, 2023, is 54,433 bytes in size and incorporated herein by reference in its entirety.

BACKGROUND

Transgenes which are placed into different positions in the plant genome through non-site specific integration can exhibit different levels of expression (Weising et al., 1988, Ann. Rev. Genet. 22:421-477). Such transgene insertion sites can also contain various undesirable rearrangements of the foreign DNA elements that include deletions and/or duplications. Furthermore, many transgene insertion sites can also comprise selectable or scoreable marker genes which in some instances are no longer required once a transgenic plant event containing the linked transgenes which confer desirable traits is selected.

Commercial transgenic plants typically comprise one or more independent insertions of transgenes at specific locations in the host plant genome that have been selected for features that include expression of the transgene(s) of interest and the transgene-conferred trait(s), absence or minimization of rearrangements, and normal Mendelian transmission of the trait(s) to progeny. An example of a selected transgenic soybean event which confers tolerance to glyphosate, glufosinate, 2,4-dichlorophenoxyacetic acid and pyridyloxyacetate herbicides is the DAS44406-6 transgenic soybean event disclosed in U.S. Patent No. 9,540,655. DAS44406-6 transgenic soybean plants express a 2mepsps protein which can confer tolerance to glyphosate, a phosphinothricin acetyl transferase (PAT) protein which confers tolerance to the herbicide glufosinate, and an aryloxyalkanoate dioxygenase (AAD-12) protein which confers tolerance to 2,4-dichlorophenoxyacetic acid and pyridyloxyacetate herbicides.

Methods for removing selectable marker genes and/or duplicated transgenes in transgene insertion sites in plant genomes involving use of site-specific recombinase systems (e.g., cre-lox) as well as for insertion of new genes into transgene insertion sites have been disclosed (Srivastava and Ow, Methods Mol Biol, 2015, 1287:95-103; Dale and Ow, 1991, Proc. Natl Acad. Sci. USA 88, 10558-10562; Srivastava and Thomson, Plant Biotechnol J, 2016;14(2):471-82). Such methods typically require incorporation of the recombination site sequences recognized by the recombinase at particular locations within the transgene.

SUMMARY

Transgenic soybean plant cells comprising an INHT26 transgenic locus comprising an originator guide RNA recognition site (OgRRS) in a first DNA junction polynucleotide of a DAS44406-6 transgenic locus and a cognate guide RNA recognition site (CgRRS) in a second DNA junction polynucleotide of the DAS44406-6 transgenic locus are provided. Transgenic soybean plant cells comprising an INHT26 transgenic locus comprising an insertion and/or substitution in a DNA junction polynucleotide of a DAS44406-6 transgenic locus of DNA comprising a cognate guide RNA recognition site (CgRRS) are provided. In certain embodiments, the DAS44406-6 transgenic locus is set forth in SEQ ID NO: 1, is present in seed deposited at the ATCC under accession No. PTA-11336, is present in progeny thereof, is present in allelic variants thereof, or is present in other variants thereof. INHT26 transgenic soybean plant cells, transgenic soybean plant seeds, and transgenic soybean plants all comprising a transgenic locus set forth in SEQ ID NO: 14 are provided. Transgenic soybean plant parts including seeds and transgenic soybean plants comprising the soybean plant cells are also provided.

Methods for obtaining a bulked population of inbred seed comprising selfing the aforementioned transgenic soybean plants and harvesting seed comprising the INHT26 transgenic locus from the selfed soybean plant are also provided.

Methods of obtaining hybrid soybean seed comprising crossing the aforementioned transgenic soybean plants to a second soybean plant which is genetically distinct from the first soybean plant and harvesting seed comprising the INHT26 transgenic locus from the cross are provided. Methods for obtaining a bulked population of seed comprising selfing a transgenic soybean plant comprising SEQ ID NO: 14 and harvesting transgenic seed comprising the transgenic locus set forth in SEQ ID NO: 14 are provided.

A DNA molecule comprising SEQ ID NO: 14, 16, 17, or an allelic variant thereof is provided. Processed transgenic soybean plant products and biological samples comprising the DNA molecules are provided. Nucleic acid molecules adapted for detection of genomic DNA comprising the DNA molecules, wherein said nucleic acid molecule optionally comprises a detectable label, are provided. Methods of detecting a soybean plant cell comprising the INHT26 transgenic locus of any one of claims 1 to 3, comprising the step of detecting a DNA molecule comprising SEQ ID NO: 14, 16, or 17, are provided.

Also provided are methods of excising the INHT26 transgenic locus from the genome of the aforementioned soybean plant cells comprising the steps of: (a) contacting the edited transgenic plant genome of the plant cell with: (i) an RNA dependent DNA endonuclease (RdDe); and (ii) a guide RNA (gRNA) capable of hybridizing to the guide RNA hybridization site of the OgRRS and the CgRRS; wherein the RdDe recognizes an OgRRS/gRNA and a CgRRS/gRNA hybridization complex; and, (b) selecting a transgenic plant cell, transgenic plant part, or transgenic plant wherein the INHT26 transgenic locus flanked by the OgRRS and the CgRRS has been excised.
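The excision steps can be pictured with a minimal in silico sketch. It is an illustration under stated assumptions and not the disclosed method: the protospacer below is a made-up placeholder rather than SEQ ID NO: 19, the names find_sites and excise are hypothetical, and the nuclease is modeled as making a single blunt cut at an assumed fixed offset past the PAM, whereas Cas12a-type nucleases generally leave staggered ends and repair outcomes vary.

```python
# Illustrative only: the sequences here are placeholders, not SEQ ID NO: 14 or 19.
PAM = "TTTA"                             # PAM named in the figure descriptions
PROTOSPACER = "GATTACAGATTACAGATTACAGA"  # hypothetical 23-nt gRNA hybridization site
CUT_OFFSET = 18                          # assumed cut position, in nt past the PAM


def find_sites(genome: str, pam: str = PAM, protospacer: str = PROTOSPACER) -> list[int]:
    """Return start indices of every PAM + protospacer match on the top strand."""
    target = pam + protospacer
    sites = []
    start = genome.find(target)
    while start != -1:
        sites.append(start)
        start = genome.find(target, start + 1)
    return sites


def excise(genome: str) -> str:
    """Cut at the first and last recognition sites and join the outer ends."""
    sites = find_sites(genome)
    if len(sites) < 2:
        raise ValueError("need two recognition sites (OgRRS and CgRRS) to excise")
    first_cut = sites[0] + len(PAM) + CUT_OFFSET
    last_cut = sites[-1] + len(PAM) + CUT_OFFSET
    # Everything between the two cuts (the flanked locus) is dropped.
    return genome[:first_cut] + genome[last_cut:]
```

In this simplified model a locus written as flank + site + insert + site + flank collapses to flank + site + flank, which mirrors step (b): cells are selected in which the DNA flanked by the OgRRS and the CgRRS is gone.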

BRIEF DESCRIPTION OF THE DRAWINGS/FIGURES

FIGS. 1A-D show a sequence (SEQ ID NO: 1) of the DAS44406-6 event transgenic locus including the endogenous genomic DNA (uppercase), transgenic insert DNA (lowercase) and 5′ and 3′ junction sequences flanking the transgenic insert DNA. The OgRRS sequence comprising the Protospacer Adjacent Motif (PAM) site (TTTA) and gRNA hybridization site (i.e., protospacer sequence; SEQ ID NO: 19) in the genomic DNA of the 5′ junction sequence is shown in bold and underlined. The PAM sites and gRNA hybridization sites for the Guide-2 (SEQ ID NO: 5), Guide-3 (SEQ ID NO: 6), and Guide-5 (SEQ ID NO: 8) gRNAs which are located in or span the 3′ junction polynucleotide sequence are in italics and double underlined. The Guide-2 and Guide-3 gRNAs are directed to transgenic DNA located 5′ to the transgene/soybean genomic DNA junction. The PAM site and gRNA hybridization site for the Guide-5 gRNA span the transgene/soybean genomic DNA junction of the 3′ junction polynucleotide.

FIGS. 2A-D show a sequence (SEQ ID NO: 14) of the INHT26 transgenic locus including the endogenous genomic DNA (uppercase) and transgenic insert DNA (lowercase) as well as the 5′ and 3′ junction sequences flanking the inserted transgenic DNA. The OgRRS sequence comprising the PAM site (TTTA) and gRNA hybridization site (i.e., protospacer sequence; SEQ ID NO: 19) in the genomic DNA of the 5′ junction sequence is shown in bold and underlined. A CgRRS comprising the PAM site (TTTA) and gRNA hybridization site (i.e., protospacer sequence; SEQ ID NO: 19) located in the endogenous genomic DNA of the 3′ junction polynucleotide is also shown in bold and underlined. The CgRRS as depicted can be introduced into the 3′ junction polynucleotide as shown by using the Guide-5 gRNA hybridization site of SEQ ID NO: 8, a suitable Cas RdDe (e.g., a Cas12a nuclease of SEQ ID NO: 15), and the donor DNA template of SEQ ID NO: 11. The INHT26 transgenic locus can be excised with a single guide RNA which hybridizes to the SEQ ID NO: 19 gRNA hybridization site and a suitable Cas RdDe (e.g., a Cas12a nuclease of SEQ ID NO: 15) which will cleave DNA in both the OgRRS which flanks the 5′ end of the INHT26 transgenic locus and the CgRRS which flanks the 3′ end of the INHT26 transgenic locus.

FIG. 3 shows a schematic diagram which compares current breeding strategies for introgression of transgenic events (i.e., transgenic loci) to alternative breeding strategies for introgression of transgenic events where the transgenic events (i.e., transgenic loci) can be removed following introgression to provide different combinations of transgenic traits. In FIG. 3, “GE” refers to genome editing (e.g., including introduction of targeted genetic changes with genome editing molecules) and “Event Removal” refers to excision of a transgenic locus (i.e., an “Event”) with genome editing molecules.

FIGS. 4A, B, C. FIG. 4A shows a schematic diagram of a non-limiting example of: (i) an untransformed plant chromosome containing non-transgenic DNA which includes the originator guide RNA recognition site (OgRRS) (top); (ii) the original transgenic locus with the OgRRS in the non-transgenic DNA of the 1st junction polynucleotide (middle); and (iii) the modified transgenic locus with a cognate guide RNA recognition site (CgRRS) inserted into the non-transgenic DNA of the 2nd junction polynucleotide (bottom). FIG. 4B shows a schematic diagram of a non-limiting example of a process where a modified transgenic locus with a CgRRS inserted into the non-transgenic DNA of the 2nd junction polynucleotide (top) is subjected to cleavage at the OgRRS and CgRRS with one guide RNA (gRNA) that hybridizes to the gRNA hybridization site in both the OgRRS and the CgRRS and an RNA dependent DNA endonuclease (RdDe) that recognizes and cleaves the gRNA/OgRRS and the gRNA/CgRRS complex, followed by non-homologous end joining processes to provide a plant chromosome where the transgenic locus is excised. FIG. 4C shows a schematic diagram of a non-limiting example of a process where a modified transgenic locus with a CgRRS inserted into the non-transgenic DNA of the 2nd junction polynucleotide (top) is subjected to cleavage at the OgRRS and CgRRS with one guide RNA (gRNA) that hybridizes to the gRNA hybridization site in both the OgRRS and the CgRRS and an RNA dependent DNA endonuclease (RdDe) that recognizes and cleaves the gRNA/OgRRS and the gRNA/CgRRS complex in the presence of a donor DNA template. In FIG. 4C, cleavage of the modified transgenic locus in the presence of the donor DNA template, which has homology to non-transgenic DNA but lacks the OgRRS in the 1st and 2nd junction polynucleotides, is followed by homology-directed repair processes to provide a plant chromosome where the transgenic locus is excised and non-transgenic DNA present in the untransformed plant chromosome is at least partially restored.

DETAILED DESCRIPTION

Unless otherwise stated, nucleic acid sequences in the text of this specification are given, when read from left to right, in the 5′ to 3′ direction. Nucleic acid sequences may be provided as DNA or as RNA, as specified; disclosure of one necessarily defines the other, as well as necessarily defines the exact complements, as is known to one of ordinary skill in the art.

Where a term is provided in the singular, the inventors also contemplate embodiments described by the plural of that term.

The term “about” as used herein means a value or range of values which would be understood as an equivalent of a stated value and can be greater or lesser than the value or range of values stated by 10 percent. Each value or range of values preceded by the term “about” is also intended to encompass the embodiment of the stated absolute value or range of values.
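As a worked illustration of this definition, the following minimal sketch computes the bounds implied by “about” for a stated value, taking the 10 percent figure from the definition above. The function name about_range is hypothetical and is introduced only for this example.

```python
def about_range(value: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) bounds covered by "about value" at +/- 10 percent."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)
```

Under this reading, “about 500 base pairs” spans roughly 450 to 550 base pairs, and the stated value of 500 itself is also encompassed.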

The phrase “allelic variant” as used herein refers to a polynucleotide or polypeptide sequence variant that occurs in a different strain, variety, or isolate of a given organism.

The term “and/or” where used herein is to be taken as specific disclosure of each of the two specified features or components with or without the other. Thus, the term “and/or” as used in a phrase such as “A and/or B” herein is intended to include “A and B,” “A or B,” “A” (alone), and “B” (alone). Likewise, the term “and/or” as used in a phrase such as “A, B, and/or C” is intended to encompass each of the following embodiments: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).

As used herein, the phrase “approved transgenic locus” is a genetically modified plant event which has been authorized, approved, and/or de-regulated for any one of field testing, cultivation, human consumption, animal consumption, and/or import by a governmental body. Illustrative and non-limiting examples of governmental bodies which provide such approvals include the Ministry of Agriculture of Argentina, Food Standards Australia New Zealand, National Biosafety Technical Committee (CTNBio) of Brazil, Canadian Food Inspection Agency, China Ministry of Agriculture Biosafety Network, European Food Safety Authority, US Department of Agriculture, US Environmental Protection Agency, and US Food and Drug Administration.

The term “backcross”, as used herein, refers to crossing an F1 plant or plants with one of the original parents. A backcross is used to maintain or establish the identity of one parent (species) and to incorporate a particular trait from a second parent (species). The term “backcross generation”, as used herein, refers to the offspring of a backcross.

As used herein, the phrase “biological sample” refers to either intact or non-intact (e.g., milled seed or plant tissue, chopped plant tissue, lyophilized tissue) plant tissue. It may also be an extract comprising intact or non-intact seed or plant tissue. The biological sample can comprise flour, meal, syrup, oil, starch, and cereals manufactured in whole or in part to contain crop plant by-products. In certain embodiments, the biological sample is “non-regenerable” (i.e., incapable of being regenerated into a plant or plant part). In certain embodiments, the biological sample refers to a homogenate, an extract, or any fraction thereof containing genomic DNA of the organism from which the biological sample was obtained, wherein the biological sample does not comprise living cells.

As used herein, the terms “correspond,” “corresponding,” and the like, when used in the context of a nucleotide position, mutation, and/or substitution in any given polynucleotide (e.g., an allelic variant of SEQ ID NO: 1) with respect to the reference polynucleotide sequence (e.g., SEQ ID NO: 1) all refer to the position of the polynucleotide residue in the given sequence that has identity to the residue in the reference nucleotide sequence when the given polynucleotide is aligned to the reference polynucleotide sequence using a pairwise alignment algorithm (e.g., CLUSTAL O 1.2.4 with default parameters).
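The mapping from a reference position to its corresponding position can be sketched as follows. The sketch assumes the pairwise alignment has already been produced by an external tool and is supplied as two equal-length rows that use “-” for gaps. The function name corresponding_position is hypothetical, and the function reports only which residue shares an alignment column with the reference residue. It does not check that the two residues are identical.

```python
from typing import Optional


def corresponding_position(aligned_ref: str, aligned_query: str, ref_pos: int) -> Optional[int]:
    """Map a 1-based reference position to the 1-based position in the query.

    Both inputs are rows of one pairwise alignment with '-' marking gaps.
    Returns None when the reference residue is aligned to a gap in the query.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    query_count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            ref_count += 1
        if query_char != "-":
            query_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return query_count if query_char != "-" else None
    raise IndexError("ref_pos is beyond the end of the reference sequence")
```

For the toy alignment rows "ACG-TA" (reference) and "A-GCTA" (given sequence), reference position 3 corresponds to position 2 of the given sequence, while reference position 2 has no counterpart because it falls opposite a gap.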

As used herein, the terms “Cpf1” and “Cas12a” are used interchangeably to refer to the same RNA dependent DNA endonuclease (RdDe). A Cas12a protein provided herein includes the protein of SEQ ID NO: 15.

The term “crossing” as used herein refers to the fertilization of female plants (or gametes) by male plants (or gametes). The term “gamete” refers to the haploid reproductive cell (egg or pollen) produced in plants by meiosis from a gametophyte and involved in sexual reproduction, during which two gametes of opposite sex fuse to form a diploid zygote. The term generally includes reference to a pollen (including the sperm cell) and an ovule (including the ovum). When referring to crossing in the context of achieving the introgression of a genomic region or segment, the skilled person will understand that in order to achieve the introgression of only a part of a chromosome of one plant into the chromosome of another plant, random portions of the genomes of both parental lines recombine during the cross due to the occurrence of crossing-over events in the production of the gametes in the parent lines. Therefore, the genomes of both parents must be combined in a single cell by a cross, whereafter the production of gametes from the cell and their fusion in fertilization will result in an introgression event.

As used herein, the phrases “DNA junction polynucleotide” and “junction polynucleotide” refer to a polynucleotide of about 18 to about 500 base pairs in length comprised of both endogenous chromosomal DNA of the plant genome and heterologous transgenic DNA which is inserted in the plant genome. A junction polynucleotide can thus comprise about 8, 10, 20, 50, 100, 200, 250, 500, or 1000 base pairs of endogenous chromosomal DNA of the plant genome and about 8, 10, 20, 50, 100, 200, 250, 500, or 1000 base pairs of heterologous transgenic DNA which span the one end of the transgene insertion site in the plant chromosomal DNA. Transgene insertion sites in chromosomes will typically contain both a 5′ junction polynucleotide and a 3′ junction polynucleotide. In embodiments set forth herein in SEQ ID NO: 1, the 5′ junction polynucleotide is located at the 5′ end of the sequence and the 3′ junction polynucleotide is located at the 3′ end of the sequence. In a non-limiting and illustrative example, a 5′ junction polynucleotide of a transgenic locus is telomere proximal in a chromosome arm and the 3′ junction polynucleotide of the transgenic locus is centromere proximal in the same chromosome arm. In another non-limiting and illustrative example, a 5′ junction polynucleotide of a transgenic locus is centromere proximal in a chromosome arm and the 3′ junction polynucleotide of the transgenic locus is telomere proximal in the same chromosome arm. The junction polynucleotide which is telomere proximal and the junction polynucleotide which is centromere proximal can be determined by comparing non-transgenic genomic sequence of a sequenced non-transgenic plant genome to the non-transgenic DNA in the junction polynucleotides.

The term “donor,” as used herein in the context of a plant, refers to the plant or plant line from which the trait, transgenic event, or genomic segment originates, wherein the donor can have the trait, introgression, or genomic segment in either a heterozygous or homozygous state.

As used herein, the term “DAS44406-6” is used to refer to any of a transgenic soybean locus, transgenic soybean plants and parts thereof including seed set forth in US Pat. No. 9,540,655, which is incorporated herein by reference in its entirety. Representative DAS44406-6 transgenic soybean seed have been deposited with American Type Culture Collection (ATCC, Manassas, Va. 20110-2209 USA) under Accession No. PTA-11336. DAS44406-6 transgenic loci include loci having the sequence of SEQ ID NO: 1, the sequence of the DAS44406-6 locus in the deposited seed of Accession No. PTA-11336 and any progeny thereof, as well as allelic variants and other variants of SEQ ID NO: 1.

As used herein, the terms “excise” and “delete,” when used in the context of a DNA molecule, are used interchangeably to refer to the removal of a given DNA segment or element (e.g., transgene element or transgenic locus or portion thereof) of the DNA molecule.

As used herein, the phrase “elite crop plant” refers to a plant which has undergone breeding to provide one or more trait improvements. Elite crop plant lines include plants which are essentially homozygous, e.g., inbred or doubled haploid. Elite crop plants can include inbred lines used as is or used as pollen donors or pollen recipients in hybrid seed production (e.g., used to produce F1 plants). Elite crop plants can include inbred lines which are selfed to produce non-hybrid cultivars or varieties or to produce (e.g., bulk up) pollen donor or recipient lines for hybrid seed production. Elite crop plants can include hybrid F1 progeny of a cross between two distinct elite inbred or doubled haploid plant lines.

As used herein, an “event,” “a transgenic event,” “a transgenic locus” and related phrases refer to an insertion of one or more transgenes at a unique site in the genome of a plant as well as to DNA fragments, plant cells, plants, and plant parts (e.g., seeds) comprising genomic DNA containing the transgene insertion. Such events typically comprise both a 5′ and a 3′ junction polynucleotide and confer one or more useful traits including herbicide tolerance, insect resistance, male sterility, and the like.

As used herein, the phrases “endogenous sequence,” “endogenous gene,” “endogenous DNA,” “endogenous polynucleotide,” and the like refer to the native form of a polynucleotide, gene or polypeptide in its natural location in the organism or in the genome of an organism.

The terms “exogenous” and “heterologous” are used synonymously herein to refer to any polynucleotide (e.g., DNA molecule) that has been inserted into a new location in the genome of a plant. Non-limiting examples of an exogenous or heterologous DNA molecule include a synthetic DNA molecule, a non-naturally occurring DNA molecule, a DNA molecule found in another species, a DNA molecule found in a different location in the same species, and/or a DNA molecule found in the same strain or isolate of a species, where the DNA molecule has been inserted into a new location in the genome of a plant.

As used herein, the term “F1” refers to any offspring of a cross between two genetically unlike individuals.

The term “gene,” as used herein, refers to a hereditary unit consisting of a sequence of DNA that occupies a specific location on a chromosome and that contains the genetic instruction for a particular characteristic or trait in an organism. The term “gene” thus includes a nucleic acid (for example, DNA or RNA) sequence that comprises coding sequences necessary for the production of an RNA, or a polypeptide or its precursor. A functional polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence as long as the desired activity or functional properties (e.g., enzymatic activity, pesticidal activity, ligand binding, and/or signal transduction) of the RNA or polypeptide are retained.

The term “identifying,” as used herein with respect to a plant, refers to a process of establishing the identity or distinguishing character of a plant, including exhibiting a certain trait, containing one or more transgenes, and/or containing one or more molecular markers.

As used herein, the term “INHT26” is used to refer either individually or collectively to items that include any or all of the DAS44406-6 transgenic soybean loci which have been modified as disclosed herein, modified DAS44406-6 transgenic soybean plants and parts thereof including seed, and DNA obtained therefrom.

The term “isolated” as used herein means having been removed from its natural environment.

As used herein, the terms “include,” “includes,” and “including” are to be construed as at least having the features to which they refer while not excluding any additional unspecified features.

As used herein, the phrase “introduced transgene” is a transgene not present in the original transgenic locus in the genome of an initial transgenic event or in the genome of a progeny line obtained from the initial transgenic event. Examples of introduced transgenes include exogenous transgenes which are inserted in a resident original transgenic locus.

As used herein, the terms “introgression”, “introgressed” and “introgressing” refer to both a natural and artificial process, and the resulting plants, whereby traits, genes or DNA sequences of one species, variety or cultivar are moved into the genome of another species, variety or cultivar, by crossing those species. The process may optionally be completed by backcrossing to the recurrent parent. Examples of introgression include entry or introduction of a gene, a transgene, a regulatory element, a marker, a trait, a trait locus, or a chromosomal segment from the genome of one plant into the genome of another plant.

The phrase “marker-assisted selection”, as used herein, refers to the diagnostic process of identifying, optionally followed by selecting, a plant from a group of plants using the presence of a molecular marker as the diagnostic characteristic or selection criterion. The process usually involves detecting the presence of a certain nucleic acid sequence or polymorphism in the genome of a plant.

The phrase “molecular marker”, as used herein, refers to an indicator that is used in methods for visualizing differences in characteristics of nucleic acid sequences. Examples of such indicators are restriction fragment length polymorphism (RFLP) markers, amplified fragment length polymorphism (AFLP) markers, single nucleotide polymorphisms (SNPs), microsatellite markers (e.g., SSRs), sequence-characterized amplified region (SCAR) markers, Next Generation Sequencing (NGS) of a molecular marker, cleaved amplified polymorphic sequence (CAPS) markers or isozyme markers or combinations of the markers described herein which define a specific genetic and chromosomal location.

As used herein the terms “native” or “natural” define a condition found in nature. A “native DNA sequence” is a DNA sequence present in nature that was produced by natural means or traditional breeding techniques but not generated by genetic engineering (e.g., using molecular biology/transformation techniques).

The term “offspring”, as used herein, refers to any progeny generation resulting from crossing, selfing, or other propagation technique.

The phrase “operably linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For instance, a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression. When the phrase “operably linked” is used in the context of a PAM site and a guide RNA hybridization site, it refers to a PAM site which permits cleavage of at least one strand of DNA in a polynucleotide with an RNA dependent DNA endonuclease or RNA dependent DNA nickase which recognizes the PAM site when a guide RNA complementary to guide RNA hybridization site sequences adjacent to the PAM site is present. An OgRRS and its CgRRS are operably linked to junction polynucleotides when they can be recognized by a gRNA and an RdDe to provide for excision of the transgenic locus or portion thereof flanked by the junction polynucleotides.

As used herein, the term “plant” includes a whole plant and any descendant, cell, tissue, or part of a plant. The term “plant parts” includes any part(s) of a plant, including, for example and without limitation: seed (including mature seed and immature seed); a plant cutting; a plant cell; a plant cell culture; or a plant organ (e.g., pollen, embryos, flowers, fruits, shoots, leaves, roots, stems, and explants). A plant tissue or plant organ may be a seed, protoplast, callus, or any other group of plant cells that is organized into a structural or functional unit. A plant cell or tissue culture may be capable of regenerating a plant having the physiological and morphological characteristics of the plant from which the cell or tissue was obtained, and of regenerating a plant having substantially the same genotype as the plant. Regenerable cells in a plant cell or tissue culture may be embryos, protoplasts, meristematic cells, callus, pollen, leaves, anthers, roots, root tips, silk, flowers, kernels, ears, cobs, husks, or stalks. In contrast, some plant cells are not capable of being regenerated to produce plants and are referred to herein as “non-regenerable” plant cells.

The term “purified,” as used herein, defines an isolation of a molecule or compound in a form that is substantially free of contaminants normally associated with the molecule or compound in a native or natural environment and means having been increased in purity as a result of being separated from other components of the original composition. The term “purified nucleic acid” is used herein to describe a nucleic acid sequence which has been separated from other compounds including, but not limited to, polypeptides, lipids and carbohydrates.

The term “recipient”, as used herein, refers to the plant or plant line receiving the trait, transgenic event or genomic segment from a donor, and which recipient may or may not have the trait, transgenic event or genomic segment itself either in a heterozygous or homozygous state.

As used herein the term “recurrent parent” or “recurrent plant” describes an elite line that is the recipient plant line in a cross and which will be used as the parent line for successive backcrosses to produce the final desired line.

As used herein the term “recurrent parent percentage” relates to the percentage that a backcross progeny plant is identical to the recurrent parent plant used in the backcross. The percent identity to the recurrent parent can be determined experimentally by measuring genetic markers such as SNPs and/or RFLPs or can be calculated theoretically based on a mathematical formula.
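The specification does not state which mathematical formula is meant. One commonly used expectation, offered here only as an assumption, is that each backcross halves the remaining donor genome on average, so that after n backcrosses the expected recurrent parent percentage is 100 × (1 − (1/2)^(n+1)), assuming no selection and unlinked loci. The function name below is hypothetical.

```python
def expected_recurrent_parent_percentage(backcrosses: int) -> float:
    """Expected percent identity to the recurrent parent after n backcrosses.

    Assumed textbook expectation: n = 0 is the F1 (50 percent), and each
    backcross halves the remaining donor contribution on average.
    """
    if backcrosses < 0:
        raise ValueError("backcrosses must be zero or greater")
    return 100.0 * (1.0 - 0.5 ** (backcrosses + 1))
```

Under this assumption the F1 is expected to be 50 percent recurrent parent, the first backcross generation 75 percent, and the third 93.75 percent. Individual plants deviate from these averages, which is why marker-based measurement is described as the experimental alternative.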

The terms “selfed,” “selfing,” and “self,” as used herein, refer to any process used to obtain progeny from the same plant or plant line as well as to plants resulting from the process. As used herein, the terms thus include any fertilization process wherein both the ovule and pollen are from the same plant or plant line and plants resulting therefrom. Typically, the terms refer to self-pollination processes and progeny plants resulting from self-pollination.

The term “selecting”, as used herein, refers to a process of picking out a certain individual plant from a group of individuals, usually based on a certain identity, trait, characteristic, and/or molecular marker of that individual.

As used herein, the phrase “originator guide RNA recognition site” or the acronym “OgRRS” refers to an endogenous DNA polynucleotide comprising a protospacer adjacent motif (PAM) site operably linked to a guide RNA hybridization site (i.e., protospacer sequence). In certain embodiments, an OgRRS can be located in an untransformed plant chromosome or in non-transgenic DNA of a DNA junction polynucleotide of both an original transgenic locus and a modified transgenic locus. In certain embodiments, an OgRRS can be located in transgenic DNA of a DNA junction polynucleotide of both an original transgenic locus and a modified transgenic locus. In certain embodiments, an OgRRS can be located in both transgenic DNA and non-transgenic DNA of a DNA junction polynucleotide of both an original transgenic locus and a modified transgenic locus (i.e., can span transgenic and non-transgenic DNA in a DNA junction polynucleotide).

As used herein the phrase “cognate guide RNA recognition site” or the acronym “CgRRS” refers to a DNA polynucleotide comprising a PAM site operably linked to a guide RNA hybridization site (i.e., protospacer sequence), where the CgRRS is absent from transgenic plant genomes comprising a first original transgenic locus that is unmodified and where the CgRRS and its corresponding OgRRS can hybridize to a single gRNA. A CgRRS can be located in non-transgenic DNA of a DNA junction polynucleotide of a modified transgenic locus, in transgenic DNA of a DNA junction polynucleotide of a modified transgenic locus, or in both transgenic and non-transgenic DNA of a modified transgenic locus (i.e., can span transgenic and non-transgenic DNA in a DNA junction polynucleotide).
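The requirement that a CgRRS and its OgRRS hybridize to a single gRNA can be illustrated with a minimal sketch that lists the protospacers following a PAM in each junction polynucleotide and reports those present in both. Several points are assumptions made for illustration: only the TTTA PAM named in the figure descriptions is searched, the protospacer length is set to 23 nucleotides, both strands are scanned, and exact sequence identity stands in for hybridization. The function names are hypothetical and the example sequences are not taken from the sequence listing.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def protospacers(seq: str, pam: str = "TTTA", length: int = 23) -> set[str]:
    """Collect every full-length protospacer that directly follows the PAM on either strand."""
    found = set()
    top = seq.upper()
    for strand in (top, reverse_complement(top)):
        start = strand.find(pam)
        while start != -1:
            spacer = strand[start + len(pam): start + len(pam) + length]
            if len(spacer) == length:
                found.add(spacer)
            start = strand.find(pam, start + 1)
    return found


def shared_by_single_grna(first_junction: str, second_junction: str) -> set[str]:
    """Protospacers present next to a PAM in both junction polynucleotides."""
    return protospacers(first_junction) & protospacers(second_junction)
```

An empty result for an unmodified second junction reflects the definition above, in which the CgRRS is absent from the original locus. A non-empty result after modification indicates that one guide could address both sites in this simplified model.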

As used herein, the phrase “a transgenic locus excision site” refers to the DNA which remains in the genome of a plant or in a DNA molecule (e.g., an isolated or purified DNA molecule) wherein a segment comprising, consisting essentially of, or consisting of a transgenic locus has been deleted. In a non-limiting and illustrative example, a transgenic locus excision site can thus comprise a contiguous segment of DNA comprising at least 10 base pairs of DNA that is telomere proximal to the deleted transgenic locus or to the deleted segment of the transgenic locus and at least 10 base pairs of DNA that is centromere proximal to the deleted transgenic locus or to the deleted segment of the transgenic locus.

As used herein, the phrase “transgene element” refers to a segment of DNA comprising, consisting essentially of, or consisting of a promoter, a 5′ UTR, an intron, a coding region, a 3′ UTR, or a polyadenylation signal. Polyadenylation signals include transgene elements referred to as “terminators” (e.g., NOS, pinII, rbcs, Hsp17, TubA).

To the extent to which any of the preceding definitions is inconsistent with definitions provided in any patent or non-patent reference incorporated herein by reference, in any patent or non-patent reference cited herein, or in any patent or non-patent reference found elsewhere, it is understood that the preceding definition will be used herein.

Genome editing molecules can permit introduction of targeted genetic changes conferring desirable traits in a variety of crop plants (Zhang et al. Genome Biol. 2018; 19: 210; Schindele et al. FEBS Lett. 2018; 592(12): 1954). Desirable traits introduced into crop plants such as soybean include herbicide tolerance, improved food and/or feed characteristics, male-sterility, and drought stress tolerance. Nonetheless, full realization of the potential of genome editing methods for crop improvement will entail efficient incorporation of the targeted genetic changes in germplasm of different elite crop plants adapted for distinct growing conditions. Such elite crop plants will also desirably comprise useful transgenic loci which confer various traits including herbicide tolerance, pest resistance (e.g., insect, nematode, fungal disease, and bacterial disease resistance), conditional male sterility systems for hybrid seed production, abiotic stress tolerance (e.g., drought tolerance), improved food and/or feed quality, and improved industrial use (e.g., biofuel). Provided herein are methods whereby targeted genetic changes are efficiently combined with desired subsets of transgenic loci in elite progeny plant lines (e.g., elite inbreds used for hybrid seed production or for inbred varietal production). Also provided are plant genomes containing modified transgenic loci which can be selectively excised with a single gRNA molecule. Such modified transgenic loci comprise an originator guide RNA recognition site (OgRRS) which is identified in non-transgenic DNA of a first junction polynucleotide of the transgenic locus and a cognate guide RNA recognition site (CgRRS) which is introduced (e.g., by genome editing methods) into a second junction polynucleotide of the transgenic locus and which can hybridize to the same gRNA as the OgRRS, thereby permitting excision of the modified transgenic locus with a single guide RNA.
An originator guide RNA recognition site (OgRRS) comprises endogenous DNA found in untransformed plants and in endogenous non-transgenic DNA of junction polynucleotides of transgenic plants containing a modified or unmodified transgenic locus. The OgRRS located in non-transgenic DNA of a first DNA junction polynucleotide is used to design a related cognate guide RNA recognition site (CgRRS) which is introduced (e.g., by genome editing methods) into the second junction polynucleotide of the transgenic locus. A CgRRS is thus present in junction polynucleotides of modified transgenic loci provided herein and is absent from endogenous DNA found in untransformed plants and absent from endogenous non-transgenic DNA found in junction sequences of transgenic plants containing an unmodified transgenic locus. Also provided are unique transgenic locus excision sites created by excision of such modified transgenic loci, DNA molecules comprising the modified transgenic loci, unique transgenic locus excision sites and/or plants comprising the same, biological samples containing the DNA, nucleic acid markers adapted for detecting the DNA molecules, and related methods of identifying the elite crop plants comprising unique transgenic locus excision sites.

Also provided herein are methods whereby targeted genetic changes are efficiently combined with desired subsets of transgenic loci in elite progeny plant lines (e.g., elite inbreds used for hybrid seed production or for inbred varietal production). Examples of such methods include those illustrated in FIG. 3. In certain embodiments, INHT26 transgenic loci provided herein are characterized by polynucleotide sequences that can facilitate as necessary the removal of the INHT26 transgenic loci from the genome. Useful applications of such INHT26 transgenic loci and related methods of making include targeted excision of an INHT26 transgenic locus or portion thereof in certain breeding lines to facilitate recovery of germplasm with subsets of transgenic traits tailored for specific geographic locations and/or grower preferences. Other useful applications of such INHT26 transgenic loci and related methods of making include removal of transgenic traits from certain breeding lines when it is desirable to replace the trait in the breeding line without disrupting other transgenic loci and/or non-transgenic loci. In certain embodiments, soybean genomes containing INHT26 transgenic loci or portions thereof which can be selectively excised with one or more gRNA molecules and RdDe (RNA dependent DNA endonucleases) which form gRNA/target DNA complexes are provided. Such selectively excisable INHT26 transgenic loci can comprise an originator guide RNA recognition site (OgRRS) which is identified in non-transgenic DNA, transgenic DNA, or a combination thereof of a first junction polynucleotide of the transgenic locus and a cognate guide RNA recognition site (CgRRS) which is introduced (e.g., by genome editing methods) into a second junction polynucleotide of the transgenic locus and which can hybridize to the same gRNA as the OgRRS, thereby permitting excision of the modified transgenic locus or portions thereof with a single guide RNA (e.g., as shown in FIGS. 3A and B).
In certain embodiments, an originator guide RNA recognition site (OgRRS) comprises endogenous DNA found in untransformed plants and in endogenous non-transgenic DNA of junction polynucleotides of transgenic plants containing a modified or unmodified transgenic locus. In certain embodiments, an originator guide RNA recognition site (OgRRS) comprises exogenous transgenic DNA of junction polynucleotides of transgenic plants containing a modified or unmodified transgenic locus. The OgRRS located in non-transgenic DNA, transgenic DNA, or a combination thereof of a first DNA junction polynucleotide is used to design a related cognate guide RNA recognition site (CgRRS) which is introduced (e.g., by genome editing methods) into the second junction polynucleotide of the transgenic locus. A CgRRS is thus present in junction polynucleotides of modified transgenic loci provided herein and is absent from endogenous DNA found in untransformed plants and absent from junction sequences of transgenic plants containing an unmodified transgenic locus. A CgRRS is also absent from a combination of non-transgenic and transgenic DNA found in junction sequences of transgenic plants containing an unmodified transgenic locus. An example of an OgRRS polynucleotide sequence in or near a 5′ junction polynucleotide in a DAS44406-6 transgenic locus is SEQ ID NO: 18, which is shown in bold and underlined in FIG. 1. OgRRS polynucleotide sequences located in a first junction polynucleotide can be introduced into the second junction polynucleotide using donor DNA templates as illustrated in FIG. 4C and as elsewhere described herein. A donor DNA template for introducing the SEQ ID NO: 18 OgRRS into the 3′ junction polynucleotide of a DAS44406-6 locus includes the donor DNA template comprising SEQ ID NO: 11.
Double stranded breaks in a 3′ junction polynucleotide of SEQ ID NO: 1 can be introduced with the Guide-1, 2, 3, 4, and/or 5 gRNAs, which are respectively encoded by SEQ ID NO: 4, 5, 6, 7, and/or 8, and a Cas12a nuclease. In certain embodiments, double stranded breaks in a 3′ junction polynucleotide of SEQ ID NO: 1 can be introduced with the Guide-1 or 2 gRNAs and any one of the Guide-3, 4, and/or 5 gRNAs and a Cas12a nuclease (e.g., a Cas12a nuclease of SEQ ID NO: 15). Integration of the SEQ ID NO: 11 donor DNA template comprising the CgRRS into the 3′ junction polynucleotide of a DAS44406-6 locus at the double stranded breaks introduced by the gRNAs encoded by SEQ ID NO: 4, 5, 6, 7, and/or 8 and a Cas12a nuclease can provide an INHT26 locus comprising the CgRRS sequence set forth in SEQ ID NO: 14. A subsequence comprising the CgRRS which is located in the 3′ junction polynucleotide of the INHT26 transgenic locus is set forth in SEQ ID NO: 17. Double stranded breaks in a 3′ junction polynucleotide of SEQ ID NO: 1 can be introduced with gRNAs encoded by SEQ ID NO: 8 and a Cas12a nuclease. A donor DNA template of SEQ ID NO: 11 or the equivalent thereof having longer or shorter homology arms can be used to obtain the CgRRS insertion in the 3′ junction polynucleotide that is set forth in SEQ ID NO: 17. An INHT26 transgenic locus containing this CgRRS insertion is set forth in SEQ ID NO: 14.

Also provided herein are allelic variants of any of the INHT26 transgenic loci or DNA molecules provided herein. In certain embodiments, such allelic variants of INHT26 transgenic loci include sequences having at least 85%, 90%, 95%, 98%, or 99% sequence identity across the entire length or at least 20, 40, 100, 500, 1,000, 2,000, 4,000, 6,000, 8,000, 9,000, 10,000, 11,000, 12,000, 13,000, or 13,659 nucleotides of SEQ ID NO: 14. In certain embodiments, such allelic variants of INHT26 DNA molecules include sequences having at least 85%, 90%, 95%, 98%, or 99% sequence identity across the entire length of SEQ ID NO: 14, 16, or 17.
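
For readers who wish to see how a sequence identity threshold of the kind recited above might be checked, the following Python sketch computes percent identity from two already-aligned sequences of equal length in which gaps are written as “-”. The function name and the convention used (identical columns divided by all alignment columns that are not gap-to-gap) are assumptions for illustration; other identity conventions exist and can give different values.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over an existing pairwise alignment.

    Assumption (illustrative): a column containing a gap in either
    sequence counts as a mismatch, and columns that are gaps in both
    sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)

# Hypothetical toy alignments, not sequences from the sequence listing.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(percent_identity("ACGT-A", "ACGTTA"))
```

A candidate variant would satisfy, for example, the 90% threshold when the returned value is at least 90.0 over the compared length.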

Also provided are unique transgenic locus excision sites created by excision of INHT26 transgenic loci or selectively excisable INHT26 transgenic loci, DNA molecules comprising the INHT26 transgenic loci or unique fragments thereof (i.e., fragments of an INHT26 locus which are not found in a DAS44406-6 transgenic locus), INHT26 plants comprising the same, biological samples containing the DNA, nucleic acid markers adapted for detecting the DNA molecules, and related methods of identifying soybean plants comprising unique INHT26 transgenic locus excision sites and unique fragments of an INHT26 transgenic locus. An example of such an excision site would include an excision site created by excising the INHT26 transgenic locus with a guide RNA encoded by SEQ ID NO: 19 and a suitable Cas RdDe (e.g., a Cas12a nuclease of SEQ ID NO: 15). DNA molecules comprising unique fragments of an INHT26 transgenic locus are diagnostic for the presence of an INHT26 transgenic locus or fragments thereof in a soybean plant, soybean cell, soybean seed, products obtained therefrom (e.g., seed meal or stover), and biological samples. DNA molecules comprising unique fragments of an INHT26 transgenic locus include DNA molecules comprising the CgRRS, such as SEQ ID NO: 17.

Methods provided herein can be used to excise any transgenic locus where the first and second junction sequences comprising the endogenous non-transgenic genomic DNA and the heterologous transgenic DNA which are joined at the site of transgene insertion in the plant genome are known or have been determined. In certain embodiments provided herein, transgenic loci can be removed from crop plant lines to obtain crop plant lines with tailored combinations of transgenic loci and optionally targeted genetic changes. Such first and second junction sequences are readily identified in new transgenic events by inverse PCR techniques using primers which are complementary to the inserted transgenic sequences. In certain embodiments, the first and second junction sequences of transgenic loci are published. An example of a transgenic locus which can be improved and used in the methods provided herein is the soybean DAS44406-6 transgenic locus. The soybean DAS44406-6 transgenic locus and its transgenic junction sequences are also depicted in FIG. 1. Soybean plants comprising the DAS44406-6 transgenic locus and seed thereof have been cultivated, have been placed in commerce, and have been described in a variety of publications by various governmental bodies. Databases which have compiled descriptions of the DAS44406-6 transgenic locus include the International Service for the Acquisition of Agri-biotech Applications (ISAAA) database (available on the world wide web internet site “isaaa.org/gmapprovaldatabase/event”), the GenBit LLC database (available on the world wide web internet site “genbitgroup.com/en/gmo/gmodatabase”), and the Biosafety Clearing-House (BCH) database (available on the http internet site “bch.cbd.int/database/organisms”).

Sequences of the junction polynucleotides as well as the transgenic insert(s) of the DAS44406-6 transgenic locus which can be improved by the methods provided herein are set forth or otherwise provided in SEQ ID NO: 1, US 9,540,655, the sequence of the DAS44406-6 locus in the deposited seed of ATCC accession No. PTA-11336, and elsewhere in this disclosure. In certain embodiments provided herein, the DAS44406-6 transgenic locus set forth in SEQ ID NO: 1 or present in the deposited seed of ATCC accession No. PTA-11336 is referred to as an “original DAS44406-6 transgenic locus.” Allelic or other variants of the sequence set forth in SEQ ID NO: 1, the patent references set forth therein and incorporated herein by reference in their entireties, and elsewhere in this disclosure which may be present in certain variant DAS44406-6 transgenic plant loci (e.g., progeny of deposited seed of accession No. PTA-11336 which contain allelic variants of SEQ ID NO: 1 or progeny originating from transgenic plant cells comprising the original DAS44406-6 transgenic locus set forth in US 9,540,655) can also be improved by identifying sequences in the variants that correspond to SEQ ID NO: 1 by performing a pairwise alignment (e.g., using CLUSTAL O 1.2.4 with default parameters) and making corresponding changes in the allelic or other variant sequences. Such allelic or other variant sequences include sequences having at least 85%, 90%, 95%, 98%, or 99% sequence identity across the entire length or at least 20, 40, 100, 500, 1,000, 2,000, 4,000, 8,000, 10,000, 11,000, 12,000, 13,000, or 13,659 nucleotides of SEQ ID NO: 1. Also provided are plants, plant parts including seeds, genomic DNA, and/or DNA obtained from INHT26 plants which comprise one or more modifications (e.g., via insertion of a CgRRS in a junction polynucleotide sequence) which provide for selective excision of the INHT26 transgenic locus or a portion thereof.
Also provided herein are methods of detecting plants, genomic DNA, and/or DNA obtained from plants comprising an INHT26 transgenic locus which contains one or more of a CgRRS, deletions of selectable marker genes, deletions of non-essential DNA, and/or a transgenic locus excision site. A first junction polynucleotide of a DAS44406-6 transgenic locus can comprise either one of the junction polynucleotides found at the 5′ end or the 3′ end of any one of the sequences set forth in SEQ ID NO: 1, allelic variants thereof, or other variants thereof. An OgRRS can be found within non-transgenic DNA, transgenic DNA, or a combination thereof in either one of the junction polynucleotides of any one of SEQ ID NO: 1, allelic variants thereof, or other variants thereof. A second junction polynucleotide of a transgenic locus can comprise either one of the junction polynucleotides found at the 5′ or 3′ end of any one of the sequences set forth in SEQ ID NO: 1, allelic variants thereof, or other variants thereof. A CgRRS can be introduced within transgenic DNA, non-transgenic DNA, or a combination thereof of either one of the junction polynucleotides of any one of SEQ ID NO: 1, allelic variants thereof, or other variants thereof to obtain an INHT26 transgenic locus. In certain embodiments, the OgRRS is found in non-transgenic DNA or transgenic DNA of the 5′ junction polynucleotide of a transgenic locus of any one of SEQ ID NO: 1, allelic variants thereof, or other variants thereof and the corresponding CgRRS is introduced into the transgenic DNA, non-transgenic DNA, or a combination thereof in the 3′ junction polynucleotide of the DAS44406-6 transgenic locus of SEQ ID NO: 1, allelic variants thereof, or other variants thereof to obtain an INHT26 transgenic locus.
In other embodiments, the OgRRS is found in non-transgenic DNA or transgenic DNA of the 3′ junction polynucleotide of the DAS44406-6 transgenic locus of any one of SEQ ID NO: 1, allelic variants thereof, or other variants thereof and the corresponding CgRRS is introduced into the transgenic DNA, non-transgenic DNA, or a combination thereof in the 5′ junction polynucleotide of the transgenic locus of SEQ ID NO: 1, allelic variants thereof, or other variants thereof to obtain an INHT26 transgenic locus.
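
The step of identifying positions in an allelic or other variant that correspond to positions in SEQ ID NO: 1 by pairwise alignment can be sketched in Python as follows. The sketch assumes that an alignment has already been produced by a separate tool and is supplied as two gapped strings of equal length; the function name, the 0-based coordinates, and the toy alignment are hypothetical and are not part of this disclosure.

```python
def map_position(aligned_ref, aligned_var, ref_pos):
    """Map a 0-based reference position to the corresponding variant position.

    Returns None when the reference base is aligned to a gap in the
    variant (i.e., the base is deleted in the variant).
    """
    if len(aligned_ref) != len(aligned_var):
        raise ValueError("aligned sequences must have the same length")
    ref_i = var_i = 0
    for r, v in zip(aligned_ref, aligned_var):
        if r != "-" and ref_i == ref_pos:
            return var_i if v != "-" else None
        if r != "-":
            ref_i += 1
        if v != "-":
            var_i += 1
    raise IndexError("ref_pos is beyond the end of the reference")

# Hypothetical toy alignment: one insertion and one deletion in the variant.
aligned_ref = "ACG-TAC"
aligned_var = "ACGGT-C"
print(map_position(aligned_ref, aligned_var, 3))
print(map_position(aligned_ref, aligned_var, 4))
```

A coordinate such as a junction position or an intended CgRRS insertion point defined on the reference could be translated in this way before corresponding changes are made in the variant sequence.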

In certain embodiments, the CgRRS is comprised in whole or in part of an exogenous DNA molecule that is introduced into a DNA junction polynucleotide by genome editing. In certain embodiments, the guide RNA hybridization site of the CgRRS is operably linked to a pre-existing PAM site in the transgenic DNA or non-transgenic DNA of the transgenic plant genome. In other embodiments, the guide RNA hybridization site of the CgRRS is operably linked to a new PAM site that is introduced in the DNA junction polynucleotide by genome editing. A CgRRS can be located in non-transgenic plant genomic DNA of a DNA junction polynucleotide of an INHT26 transgenic locus, in transgenic DNA of a DNA junction polynucleotide of an INHT26 transgenic locus, or can span the junction of the transgenic and non-transgenic DNA of a DNA junction polynucleotide of an INHT26 transgenic locus. An OgRRS can likewise be located in non-transgenic plant genomic DNA of a DNA junction polynucleotide of an INHT26 transgenic locus, in transgenic DNA of a DNA junction polynucleotide of an INHT26 transgenic locus, or can span the junction of the transgenic and non-transgenic DNA of a DNA junction polynucleotide of an INHT26 transgenic locus.

Methods provided herein can be used in a variety of breeding schemes to obtain elite crop plants comprising subsets of desired modified transgenic loci comprising an OgRRS and a CgRRS operably linked to junction polynucleotide sequences and transgenic loci excision sites where undesired transgenic loci or portions thereof have been removed (e.g., by use of the OgRRS and a CgRRS). Such methods are useful at least insofar as they allow for production of distinct useful donor plant lines each having unique sets of modified transgenic loci and, in some instances, targeted genetic changes that are tailored for distinct geographies and/or product offerings. In an illustrative and non-limiting example, different product lines comprising transgenic loci conferring only two of three types of herbicide tolerance (e.g., glyphosate, glufosinate, and dicamba) can be obtained from a single donor line comprising three distinct transgenic loci conferring resistance to all three herbicides. In certain aspects, plants comprising the subsets of desired transgenic loci and transgenic loci excision sites can further comprise targeted genetic changes. Such elite crop plants can be inbred plant lines or can be hybrid plant lines. In certain embodiments, at least two transgenic loci (e.g., transgenic loci including an INHT26 and another modified transgenic locus wherein an OgRRS and a CgRRS site are operably linked to a first and a second junction sequence and optionally a selectable marker gene and/or non-essential DNA are deleted) are introgressed into a desired donor line comprising elite crop plant germplasm and then subjected to genome editing molecules to recover plants comprising one of the two introgressed transgenic loci as well as a transgenic loci excision site introduced by excision of the other transgenic locus or portion thereof by the genome editing molecules.
In certain embodiments, the genome editing molecules can be used to remove a transgenic locus and introduce targeted genetic changes in the crop plant genome. Introgression can be achieved by backcrossing plants comprising the transgenic loci to a recurrent parent comprising the desired elite germplasm and selecting progeny with the transgenic loci and recurrent parent germplasm. Such backcrosses can be repeated and/or supplemented by molecular assisted breeding techniques using SNP or other nucleic acid markers to select for recurrent parent germplasm until a desired recurrent parent percentage is obtained (e.g., at least about 95%, 96%, 97%, 98%, or 99% recurrent parent percentage). A non-limiting, illustrative depiction of a scheme for obtaining plants with both subsets of transgenic loci and the targeted genetic changes is shown in FIG. 3 (bottom “Alternative” panel), where two or more of the transgenic loci (“Event” in FIG. 3) are provided in Line A and then moved into elite crop plant germplasm by introgression. In the non-limiting FIG. 3 illustration, introgression can be achieved by crossing a “Line A” comprising two or more of the modified transgenic loci to the elite germplasm and then backcrossing progeny of the cross comprising the transgenic loci to the elite germplasm as the recurrent parent to obtain a “Universal Donor” (e.g., Line A+ in FIG. 3) comprising two or more of the modified transgenic loci. This elite germplasm containing the modified transgenic loci (e.g., “Universal Donor” of FIG. 3) can then be subjected to genome editing molecules which can excise at least one of the transgenic loci (“Event Removal” in FIG. 3) and introduce other targeted genetic changes (“GE” in FIG. 3) in the genomes of the elite crop plants containing one of the transgenic loci and a transgenic locus excision site corresponding to the removal site of one of the transgenic loci. Such selective excision of transgenic loci or portions thereof can be effected by contacting the genome of the plant comprising two transgenic loci with gene editing molecules (e.g., RdDe and gRNAs, TALENs, and/or ZFNs) which recognize one transgenic locus but not another transgenic locus. Genome editing molecules that provide for selective excision of a first modified transgenic locus comprising an OgRRS and a CgRRS include a gRNA that hybridizes to the OgRRS and CgRRS of the first modified transgenic locus and an RdDe that recognizes the gRNA/OgRRS and gRNA/CgRRS complexes. Distinct plant lines with different subsets of transgenic loci and desired targeted genetic changes are thus recovered (e.g., “Line B-1,” “Line B-2,” and “Line B-3” in FIG. 3). In certain embodiments, it is also desirable to bulk up populations of inbred elite crop plants or their seed comprising the subset of transgenic loci and a transgenic locus excision site by selfing. In certain embodiments, inbred progeny of the selfed soybean plants comprising the INHT26 transgenic loci can be used as a pollen donor or recipient for hybrid seed production. Such hybrid seed and the progeny grown therefrom can comprise a subset of desired transgenic loci and a transgenic loci excision site.
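
As a rough guide to the recurrent parent percentages mentioned above, the following Python sketch computes the expected genome-wide fraction of recurrent parent germplasm after a given number of backcrosses, using the standard expectation that each backcross halves the remaining donor fraction on average when no marker-based selection is applied. The function names are hypothetical, and actual plants vary around these expectations; marker-assisted selection can reach a target in fewer generations.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent parent fraction after n backcrosses (F1 = 0 backcrosses)."""
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)

def backcrosses_needed(target_fraction):
    """Smallest number of backcrosses whose expectation meets the target."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target must be strictly between 0 and 1")
    n = 0
    while expected_recurrent_fraction(n) < target_fraction:
        n += 1
    return n

for target in (0.95, 0.99):
    n = backcrosses_needed(target)
    print(target, n, expected_recurrent_fraction(n))
```

Under this expectation, four backcrosses give about 96.9% and six backcrosses give about 99.2% recurrent parent germplasm.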

Hybrid plant lines comprising elite crop plant germplasm, at least one transgenic locus and at least one transgenic locus excision site, and in certain aspects, additional targeted genetic changes are also provided herein. Methods for production of such hybrid seed can comprise crossing elite crop plant lines where at least one of the pollen donor or recipient comprises at least the transgenic locus and a transgenic locus excision site and/or additional targeted genetic changes. In certain embodiments, the pollen donor and recipient will comprise germplasm of distinct heterotic groups and provide hybrid seed and plants exhibiting heterosis. In certain embodiments, the pollen donor and recipient can each comprise a distinct transgenic locus which confers either a distinct trait (e.g., herbicide tolerance or insect resistance), a different type of trait (e.g., tolerance to distinct herbicides or to distinct insects such as coleopteran or lepidopteran insects), or a different mode-of-action for the same trait (e.g., resistance to coleopteran insects by two distinct modes-of-action or resistance to lepidopteran insects by two distinct modes-of-action). In certain embodiments, the pollen recipient will be rendered male sterile or conditionally male sterile. Methods for inducing male sterility or conditional male sterility include emasculation (e.g., detasseling), cytoplasmic male sterility, chemical hybridizing agents or systems, transgenes or transgene systems, and/or mutation(s) in one or more endogenous plant genes. Various male sterility systems that can be adapted for use with the elite crop plants provided herein are described in Wan et al. Molecular Plant; 12, 3, (2019):321-342 as well as in US 8,618,358; US 20130031674; and US 2003188347.

In certain embodiments, edited transgenic plant genomes, transgenic plant cells, parts, or plants containing those genomes, and DNA molecules obtained therefrom, can comprise a desired subset of transgenic loci and/or comprise at least one transgenic locus excision site. In certain embodiments, a segment comprising an INHT26 transgenic locus comprising an OgRRS in non-transgenic DNA of a first junction polynucleotide sequence and a CgRRS in a second junction polynucleotide sequence is deleted with a gRNA and RdDe that recognize the OgRRS and the CgRRS to produce an INHT26 transgenic locus excision site. For example, an INHT26 transgenic locus set forth in SEQ ID NO: 14 can be deleted with a Cas12a RdDe (e.g., the Cas12a of SEQ ID NO: 15) and a gRNA comprising an RNA encoded by SEQ ID NO: 19. In certain embodiments, the transgenic locus excision site can comprise a contiguous segment of DNA comprising at least 10 base pairs of DNA that is telomere proximal to the deleted segment of the transgenic locus and at least 10 base pairs of DNA that is centromere proximal to the deleted segment of the transgenic locus wherein the transgenic DNA (i.e., the heterologous DNA) that has been inserted into the crop plant genome has been deleted. In certain embodiments where a segment comprising a transgenic locus has been deleted, the transgenic locus excision site can comprise a contiguous segment of DNA comprising at least 10 base pairs of DNA that is telomere proximal to the deleted segment of the transgenic locus and at least 10 base pairs of DNA that is centromere proximal to the deleted segment of the transgenic locus wherein the heterologous transgenic DNA and at least 1, 2, 5, 10, 20, 50, or more base pairs of endogenous DNA located in a 5′ junction sequence and/or in a 3′ junction sequence of the original transgenic locus have been deleted.
In such embodiments where DNA comprising the transgenic locus is deleted, a transgenic locus excision site can comprise at least 10 base pairs of DNA that is telomere proximal to the deleted segment of the transgenic locus and at least 10 base pairs of DNA that is centromere proximal to the deleted segment of the transgenic locus wherein all of the transgenic DNA is absent and either all or less than all of the endogenous DNA flanking the transgenic DNA sequences is present. In certain embodiments where a segment consisting essentially of an original transgenic locus has been deleted, the transgenic locus excision site can be a contiguous segment of at least 10 base pairs of DNA that is telomere proximal to the deleted segment of the transgenic locus and at least 10 base pairs of DNA that is centromere proximal to the deleted segment of the transgenic locus wherein less than all of the heterologous transgenic DNA that has been inserted into the crop plant genome is excised. In certain aforementioned embodiments where a segment consisting essentially of an original transgenic locus has been deleted, the transgenic locus excision site can thus contain at least 1 base pair of DNA or 1 to about 2 or 5, 8, 10, 20, or 50 base pairs of DNA comprising the telomere proximal and/or centromere proximal heterologous transgenic DNA that has been inserted into the crop plant genome. In certain embodiments where a segment consisting of an original transgenic locus has been deleted, the transgenic locus excision site can contain a contiguous segment of DNA comprising at least 10 base pairs of DNA that is telomere proximal to the deleted segment of the transgenic locus and at least 10 base pairs of DNA that is centromere proximal to the deleted segment of the transgenic locus wherein the heterologous transgenic DNA that has been inserted into the crop plant genome is deleted.
In certain embodiments where DNA consisting of the transgenic locus is deleted, a transgenic locus excision site can comprise at least 10 base pairs of DNA that is telomere proximal to the deleted segment of the transgenic locus and at least 10 base pairs of DNA that is centromere proximal to the deleted segment of the transgenic locus wherein all of the heterologous transgenic DNA that has been inserted into the crop plant genome is deleted and all of the endogenous DNA flanking the heterologous sequences of the transgenic locus is present. In any of the aforementioned embodiments or in other embodiments, the contiguous segment of DNA comprising the transgenic locus excision site can further comprise an insertion of 1 to about 2, 5, 10, 20, or more nucleotides between the DNA that is telomere proximal to the deleted segment of the transgenic locus and the DNA that is centromere proximal to the deleted segment of the transgenic locus. Such insertions can result from endogenous DNA repair and/or recombination activities at the double stranded breaks introduced at the excision site and/or from deliberate insertion of an oligonucleotide. Plants, edited plant genomes, biological samples, and DNA molecules (e.g., including isolated or purified DNA molecules) comprising the INHT26 transgenic loci excision sites are provided herein.

In other embodiments, a segment comprising an INHT26 transgenic locus (e.g., a transgenic locus comprising an OgRRS in non-transgenic DNA of a first junction sequence and a CgRRS in a second junction sequence) can be deleted with a gRNA and RdDe that recognize the OgRRS and the CgRRS (e.g., the Cas12a RdDe of SEQ ID NO: 15 and a gRNA comprising an RNA encoded by SEQ ID NO: 19) and replaced with DNA comprising the endogenous non-transgenic plant genomic DNA present in the genome prior to transgene insertion. A non-limiting example of such replacements can be visualized in FIG. 4C, where the donor DNA template can comprise the endogenous non-transgenic plant genomic DNA present in the genome prior to transgene insertion along with sufficient homology to non-transgenic DNA on each side of the excision site to permit homology-directed repair. In certain embodiments, the endogenous non-transgenic plant genomic DNA present in the genome prior to transgene insertion can be at least partially restored. In certain embodiments, the endogenous non-transgenic plant genomic DNA present in the genome prior to transgene insertion can be essentially restored such that no more than about 5, 10, or 20 to about 50, 80, or 100 nucleotides are changed relative to the endogenous DNA at the essentially restored excision site.

In certain embodiments, edited transgenic plant genomes and transgenic plant cells, plant parts, or plants containing those edited genomes comprise a modification of an original transgenic locus, where the modification comprises an OgRRS and a CgRRS which are operably linked to a first and a second junction sequence, respectively or irrespectively, and optionally further comprises a deletion of a segment of the original transgenic locus. In certain embodiments, the modification comprises two or more separate deletions and/or there is a modification in two or more original transgenic plant loci. In certain embodiments, the deleted segment comprises, consists essentially of, or consists of a segment of non-essential DNA in the transgenic locus. Illustrative examples of non-essential DNA include but are not limited to synthetic cloning site sequences, duplications of transgene sequences, fragments of transgene sequences, and Agrobacterium right and/or left border sequences. In certain embodiments, the non-essential DNA is a duplication and/or fragment of a promoter sequence and/or is not the promoter sequence operably linked in the cassette to drive expression of a transgene. In certain embodiments, excision of the non-essential DNA improves a characteristic, functionality, and/or expression of a transgene of the transgenic locus or otherwise confers a recognized improvement in a transgenic plant comprising the edited transgenic plant genome. In certain embodiments, the non-essential DNA does not comprise DNA encoding a selectable marker gene. In certain embodiments of an edited transgenic plant genome, the modification comprises a deletion of the non-essential DNA and a deletion of a selectable marker gene. The modification producing the edited transgenic plant genome could occur by excising both the non-essential DNA and the selectable marker gene at the same time, e.g., in the same modification step, or the modification could occur step-wise.
For example, an edited transgenic plant genome in which a selectable marker gene has previously been removed from the transgenic locus can comprise an original transgenic locus from which a non-essential DNA is further excised and vice versa. In certain embodiments, the modification comprising deletion of the non-essential DNA and deletion of the selectable marker gene comprises excising a single segment of the original transgenic locus that comprises both the non-essential DNA and the selectable marker gene. Such modification would result in one excision site in the edited transgenic genome corresponding to the deletion of both the non-essential DNA and the selectable marker gene. In certain embodiments, the modification comprising deletion of the non-essential DNA and deletion of the selectable marker gene comprises excising two or more segments of the original transgenic locus to achieve deletion of both the non-essential DNA and the selectable marker gene. Such modification would result in at least two excision sites in the edited transgenic genome corresponding to the deletion of both the non-essential DNA and the selectable marker gene. In certain embodiments of an edited transgenic plant genome, prior to excision, the segment to be deleted is flanked by operably linked protospacer adjacent motif (PAM) sites in the original or unmodified transgenic locus and/or the segment to be deleted encompasses an operably linked PAM site in the original or unmodified transgenic locus. In certain embodiments, following excision of the segment, the resulting edited transgenic plant genome comprises PAM sites flanking the deletion site in the modified transgenic locus. In certain embodiments of an edited transgenic plant genome, the modification comprises a modification of a DAS44406-6 transgenic locus.

In certain embodiments, improvements in a transgenic plant locus are obtained by introducing a new cognate guide RNA recognition site (CgRRS) which is operably linked to a DNA junction polynucleotide of the transgenic locus in the transgenic plant genome. Such CgRRS sites can be recognized by RdDe and a single suitable guide RNA directed to the CgRRS and the originator gRNA Recognition Site (OgRRS) to provide for cleavage within the junction polynucleotides which flank an INHT26 transgenic locus. In certain embodiments, the CgRRS/gRNA and OgRRS/gRNA hybridization complexes are recognized by the same class of RdDe (e.g., Class 2 type II or Class 2 type V) or by the same RdDe (e.g., both the CgRRS/gRNA and OgRRS/gRNA hybridization complexes recognized by the same Cas9 or Cas12 RdDe). Such CgRRS and OgRRS can be recognized by RdDe and suitable guide RNAs containing crRNA sufficiently complementary to the guide RNA hybridization site DNA sequences adjacent to the PAM site of the CgRRS and the OgRRS to provide for cleavage within or near the two junction polynucleotides. Suitable guide RNAs can be in the form of a single gRNA comprising a crRNA or in the form of a crRNA/tracrRNA complex. In the case of the OgRRS site, the PAM and guide RNA hybridization site are endogenous DNA polynucleotide molecules found in the plant genome. In certain embodiments where the CgRRS is introduced into the plant genome by genome editing, gRNA hybridization site polynucleotides introduced at the CgRRS are at least 17 or 18 nucleotides in length and are complementary to the crRNA of a guide RNA. In certain embodiments, the gRNA hybridization site sequence of the OgRRS and/or the CgRRS is about 17 or 18 to about 24 nucleotides in length.
The gRNA hybridization site sequence of the OgRRS and the gRNA hybridization site of the CgRRS can be of different lengths or comprise different sequences so long as there is sufficient complementarity to permit hybridization by a single gRNA and recognition by a RdDe that recognizes and cleaves DNA at the gRNA/OgRRS and gRNA/CgRRS complex. In certain embodiments, the guide RNA hybridization site of the CgRRS comprises about a 17 or 18 to about 24 nucleotide sequence which is identical to the guide RNA hybridization site of the OgRRS. In other embodiments, the guide RNA hybridization site of the CgRRS comprises about a 17 or 18 to about 24 nucleotide sequence which has one, two, three, four, or five nucleotide insertions, deletions or substitutions when compared to the guide RNA hybridization site of the OgRRS. Certain CgRRS comprising a gRNA hybridization site containing one, two, three, four, or five nucleotide insertions, deletions or substitutions when compared to the guide RNA hybridization site of the OgRRS can undergo hybridization with a gRNA which is complementary to the OgRRS gRNA hybridization site and be cleaved by certain RdDe. Examples of mismatches between gRNAs and guide RNA hybridization sites which allow for RdDe recognition and cleavage include mismatches resulting from both nucleotide insertions and deletions in the DNA which is hybridized to the gRNA (e.g., Lin et al., doi: 10.1093/nar/gku402). In certain embodiments, an operably linked PAM site is co-introduced with the gRNA hybridization site polynucleotide at the CgRRS. In certain embodiments, the gRNA hybridization site polynucleotides are introduced at a position adjacent to a resident endogenous PAM sequence in the junction polynucleotide sequence to form a CgRRS where the gRNA hybridization site polynucleotides are operably linked to the endogenous PAM site.
In certain embodiments, non-limiting features of the OgRRS, CgRRS, and/or the gRNA hybridization site polynucleotides thereof include: (i) absence of significant homology or sequence identity (e.g., less than 50% sequence identity across the entire length of the OgRRS, CgRRS, and/or the gRNA hybridization site sequence) to any other endogenous or transgenic sequences present in the transgenic plant genome or in other transgenic genomes of the soybean plant being transformed and edited; (ii) absence of significant homology or sequence identity (e.g., less than 50% sequence identity across the entire length of the sequence) of a sequence of a first OgRRS and a first CgRRS to a second OgRRS and a second CgRRS which are operably linked to junction polynucleotides of a distinct transgenic locus; (iii) the presence of some sequence identity (e.g., about 25%, 40%, or 50% to about 60%, 70%, or 80%) between the OgRRS sequence and endogenous sequences present at the site where the CgRRS sequence is introduced; and/or (iv) optimization of the gRNA hybridization site polynucleotides for recognition by the RdDe and guide RNA when used in conjunction with a particular PAM sequence. In certain embodiments, the first and second OgRRS as well as the first and second CgRRS are recognized by the same class of RdDe (e.g., Class 2 type II or Class 2 type V) or by the same RdDe (e.g., Cas9 or Cas12 RdDe). In certain embodiments, the first OgRRS site in a first junction polynucleotide and the CgRRS introduced in the second junction polynucleotide permit excision of a first transgenic locus by a first single guide RNA and a single RdDe.
Such nucleotide insertions or genome edits used to introduce CgRRS in a transgenic plant genome can be effected in the plant genome by using gene editing molecules (e.g., RdDe and guide RNAs, RNA dependent nickases and guide RNAs, Zinc Finger nucleases or nickases, or TALE nucleases or nickases) which introduce blunt double stranded breaks or staggered double stranded breaks in the DNA junction polynucleotides. In the case of DNA insertions, the genome editing molecules can also in certain embodiments further comprise a donor DNA template or other DNA template which comprises the heterologous nucleotides for insertion to form the CgRRS. Guide RNAs can be directed to the junction polynucleotides by using a pre-existing PAM site located within or adjacent to a junction polynucleotide of the transgenic locus. Non-limiting examples of such pre-existing PAM sites present in junction polynucleotides, which can be used either in conjunction with an inserted heterologous sequence to form a CgRRS or which can be used to create a double stranded break to insert or create a CgRRS, include PAM sites recognized by a Cas12a enzyme. Non-limiting examples where a CgRRS is created in a DNA sequence are illustrated in Example 2 and FIG. 2.
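The length and mismatch limits described above for the OgRRS and CgRRS gRNA hybridization sites (about 17 or 18 to about 24 nucleotides, with up to five insertions, deletions, or substitutions between the two sites) can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and are not taken from the sequence listing. Edit distance is used here only as a simple proxy; whether a given RdDe actually cleaves a mismatched site depends on the enzyme and on where the mismatches fall.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-nucleotide
    insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, base_a in enumerate(a, start=1):
        curr = [i]
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sites_compatible(ogrrs_site: str, cgrrs_site: str,
                     max_edits: int = 5,
                     min_len: int = 17, max_len: int = 24) -> bool:
    """Return True if both hybridization sites fall in the stated length
    range and differ by no more than max_edits edits (illustrative only)."""
    for site in (ogrrs_site, cgrrs_site):
        if not (min_len <= len(site) <= max_len):
            return False
    return edit_distance(ogrrs_site, cgrrs_site) <= max_edits


# Hypothetical 21-nt hybridization sites (not from SEQ ID NO: 1-19).
OGRRS_SITE = "GATTACAGATTACAGATTACA"
CGRRS_SITE = "GATTACAGATTTCAGATTACA"  # one substitution relative to OGRRS_SITE

if __name__ == "__main__":
    print(edit_distance(OGRRS_SITE, CGRRS_SITE))
    print(sites_compatible(OGRRS_SITE, CGRRS_SITE))
```

The default thresholds mirror the ranges recited in the text and can be tightened for a particular RdDe.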

Transgenic loci comprising OgRRS and CgRRS in a first and a second junction polynucleotides can be excised from the genomes of transgenic plants by contacting the transgenic loci with RdDe or RNA directed nickases, and a suitable guide RNA directed to the OgRRS and CgRRS (e.g., the Cas12a RdDe of SEQ ID NO: 15 and a gRNA comprising an RNA encoded by SEQ ID NO: 19). A non-limiting example where a modified transgenic locus is excised from a plant genome by use of a gRNA and an RdDe that recognizes an OgRRS/gRNA and a CgRRS/gRNA complex and introduces dsDNA breaks in both junction polynucleotides and repaired by NHEJ is depicted in FIG. 4B. In the depicted example set forth in FIG. 4B, the OgRRS site and the CgRRS site are absent from the plant chromosome comprising the transgene excision site that results from the process. In other embodiments provided herein where a modified transgenic locus is excised from a plant genome by use of a gRNA and an RdDe that recognizes an OgRRS/gRNA and a CgRRS/gRNA complex and repaired by NHEJ or microhomology-mediated end joining (MMEJ), the OgRRS and/or other non-transgenic sequences that were originally present prior to transgene insertion are partially or essentially restored.
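The excision described above can be pictured with a toy string model in Python. This is a sketch under strong simplifying assumptions, none of which come from the specification: the chromosome is a single-stranded string, both recognition sites are identical and in the same orientation, the nuclease makes a blunt cut at the same offset in each site, and the two outer ends are rejoined precisely. Under those assumptions one copy of the recognition site is regenerated at the excision site, which loosely parallels the embodiments in which the OgRRS is restored. Real NHEJ or MMEJ repair often adds or removes nucleotides at the junction, and the FIG. 4B outcome (both sites absent) is not modeled here. All names and sequences are hypothetical.

```python
def excise_between_sites(chromosome: str, site: str, cut_offset: int) -> str:
    """Cut inside the first and last occurrence of `site` at `cut_offset`
    and rejoin the outer fragments, dropping everything in between."""
    first = chromosome.find(site)
    last = chromosome.rfind(site)
    if first == -1 or first == last:
        raise ValueError("need two separate copies of the recognition site")
    if not (0 <= cut_offset <= len(site)):
        raise ValueError("cut_offset must lie within the site")
    left_end = chromosome[:first + cut_offset]   # kept: 5' flank + site prefix
    right_end = chromosome[last + cut_offset:]   # kept: site suffix + 3' flank
    return left_end + right_end


# Hypothetical toy chromosome: flank, site, "transgene", site, flank.
SITE = "ACGTTGCA"
CHROMOSOME = "AAAA" + SITE + "TTTTGGGGCCCC" + SITE + "AAGG"

if __name__ == "__main__":
    print(excise_between_sites(CHROMOSOME, SITE, cut_offset=5))
```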

In certain embodiments, edited transgenic plant genomes provided herein can lack one or more selectable markers found in an original event (transgenic locus). Original DAS44406-6 transgenic loci (events), including those set forth in SEQ ID NO: 1, US 9,540,655, the sequence of the DAS44406-6 locus in the deposited seed of accession No. PTA-11336 and progeny thereof, contain a selectable marker gene encoding a phosphinothricin acetyl transferase (PAT) protein which confers tolerance to the herbicide glufosinate. In certain embodiments provided herein, the DNA element comprising, consisting essentially of, or consisting of the PAT selectable marker gene of a DAS44406-6 transgenic locus is absent from an INHT26 transgenic locus. The PAT selectable marker cassette can be excised from an original DAS44406-6 transgenic locus by contacting the transgenic locus with one or more gene editing molecules which introduce double stranded breaks in the transgenic locus at the 5′ and 3′ end of the expression cassette comprising the PAT selectable marker transgene (e.g., an RdDe and guide RNAs directed to PAM sites located at the 5′ and 3′ end of the expression cassette comprising the PAT selectable marker transgene) and selecting for plant cells, plant parts, or plants wherein the selectable marker has been excised. In certain embodiments, the selectable or scoreable marker transgene can be inactivated. Inactivation can be achieved by modifications including insertion, deletion, and/or substitution of one or more nucleotides in a promoter element, 5′ or 3′ untranslated regions (UTRs), intron, coding region, and/or 3′ terminator and/or polyadenylation site of the selectable marker transgene. Such modifications can inactivate the selectable marker transgene by eliminating or reducing promoter activity, introducing a missense mutation, and/or introducing a premature stop codon. In certain embodiments, the selectable PAT marker transgene can be replaced by an introduced transgene.
In certain embodiments, an original transgenic locus that was contacted with gene editing molecules which introduce double stranded breaks in the transgenic locus at the 5′ and 3′ end of the expression cassette comprising the PAT selectable marker transgene can also be contacted with a suitable donor DNA template comprising an expression cassette flanked by DNA homologous to remaining DNA in the transgenic locus located 5′ and 3′ to the selectable marker excision site. In certain embodiments, a coding region of the PAT selectable marker transgene can be replaced with another coding region such that the replacement coding region is operably linked to the promoter and 3′ terminator or polyadenylation site of the PAT selectable marker transgene.

In certain embodiments, edited transgenic plant genomes provided herein can comprise additional new introduced transgenes (e.g., expression cassettes) inserted into the transgenic locus of a given event. Introduced transgenes inserted at the transgenic locus of an event subsequent to the event’s original isolation can be obtained by inducing a double stranded break at a site within an original transgenic locus (e.g., with genome editing molecules including an RdDe and suitable guide RNA(s); a suitable engineered zinc-finger nuclease; a TALEN protein and the like) and providing an exogenous transgene in a donor DNA template which can be integrated at the site of the double stranded break (e.g., by homology-directed repair (HDR) or by non-homologous end-joining (NHEJ)). In certain embodiments, an OgRRS and a CgRRS located in a 1^(st) junction polynucleotide and a 2^(nd) junction polynucleotide, respectively, can be used to delete the transgenic locus and replace it with one or more new expression cassettes. In certain embodiments, such deletions and replacements are effected by introducing dsDNA breaks in both junction polynucleotides and providing the new expression cassettes on a donor DNA template (e.g., in FIG. 4C, the donor DNA template can comprise an expression cassette flanked by DNA homologous to non-transgenic DNA located telomere proximal and centromere proximal to the excision site). Suitable expression cassettes for insertion include DNA molecules comprising promoters which are operably linked to DNA encoding proteins and/or RNA molecules which confer useful traits which are in turn operably linked to polyadenylation sites or terminator elements. In certain embodiments, such expression cassettes can also comprise 5′ UTRs, 3′ UTRs, and/or introns.
Useful traits include biotic stress tolerance (e.g., insect resistance, nematode resistance, or disease resistance), abiotic stress tolerance (e.g., heat, cold, drought, and/or salt tolerance), herbicide tolerance, and quality traits (e.g., improved fatty acid compositions, protein content, starch content, and the like). Suitable expression cassettes for insertion include expression cassettes which confer insect resistance, herbicide tolerance, biofuel use, or male sterility traits contained in any of the transgenic events set forth in U.S. Pat. Application Publication Nos. 20090038026, 20130031674, 20150361446, 20170088904, 20150267221, 201662346688, and 20200190533 as well as in U.S. Pat. Nos. 6342660, 7323556, 6040497, 8759618, 7157281, 6852915, 7705216, 10316330, 8618358, 8450561, 8212113, 9428765, 7897748, 8273959, 8093453, 8901378, 9994863, 7928296, and 8466346, each of which is incorporated herein by reference in its entirety.

In certain embodiments, INHT26 plants provided herein, including plants with one or more transgenic loci, modified transgenic loci, and/or comprising transgenic loci excision sites can further comprise one or more targeted genetic changes introduced by one or more gene editing molecules or systems. Also provided are methods where the targeted genetic changes are introduced and one or more transgenic loci are removed from plants either in series or in parallel (e.g., as set forth in the non-limiting illustration in FIG. 3, bottom “Alternative” panel, where “GE” can represent targeted genetic changes induced by gene editing molecules and “Event Removal” represents excision of one or more transgenic loci with gene editing molecules). Such targeted genetic changes include those conferring traits such as improved yield, improved food and/or feed characteristics (e.g., improved oil, starch, protein, or amino acid quality or quantity), improved nitrogen use efficiency, improved biofuel use characteristics (e.g., improved ethanol production), male sterility/conditional male sterility systems (e.g., by targeting endogenous MS26, MS45 and MSCA1 genes), herbicide tolerance (e.g., by targeting endogenous ALS, EPSPS, HPPD, or other herbicide target genes), delayed flowering, non-flowering, increased biotic stress resistance (e.g., resistance to insect, nematode, bacterial, or fungal damage), increased abiotic stress resistance (e.g., resistance to drought, cold, heat, metal, or salt), enhanced lodging resistance, enhanced growth rate, enhanced biomass, enhanced tillering, enhanced branching, delayed flowering time, delayed senescence, increased flower number, improved architecture for high density planting, improved photosynthesis, increased root mass, increased cell number, improved seedling vigor, improved seedling size, increased rate of cell division, improved metabolic efficiency, and increased meristem size in comparison to a control plant lacking the targeted genetic change.
Types of targeted genetic changes that can be introduced include insertions, deletions, and substitutions of one or more nucleotides in the crop plant genome. Sites in endogenous plant genes for the targeted genetic changes include promoter, coding, and non-coding regions (e.g., 5′ UTRs, introns, splice donor and acceptor sites and 3′ UTRs). In certain embodiments, the targeted genetic change comprises an insertion of a regulatory or other DNA sequence in an endogenous plant gene. Non-limiting examples of regulatory sequences which can be inserted into endogenous plant genes with gene editing molecules to effect targeted genetic changes which confer useful phenotypes include those set forth in U.S. Pat. Application Publication 20190352655, which is incorporated herein by reference in its entirety, such as: (a) an auxin response element (AuxRE) sequence; (b) at least one D1-4 sequence (Ulmasov et al. (1997) Plant Cell, 9:1963-1971); (c) at least one DR5 sequence (Ulmasov et al. (1997) Plant Cell, 9:1963-1971); (d) at least one m5-DR5 sequence (Ulmasov et al. (1997) Plant Cell, 9:1963-1971); (e) at least one P3 sequence; (f) a small RNA recognition site sequence bound by a corresponding small RNA (e.g., an siRNA, a microRNA (miRNA), a trans-acting siRNA as described in U.S. Pat. No. 8,030,473, or a phased sRNA as described in U.S. Pat. No.
8,404,928; both of these cited patents are incorporated by reference herein); (g) a microRNA (miRNA) recognition site sequence; (h) the sequence recognizable by a specific binding agent includes a microRNA (miRNA) recognition sequence for an engineered miRNA wherein the specific binding agent is the corresponding engineered mature miRNA; (i) a transposon recognition sequence; (j) a sequence recognized by an ethylene-responsive element binding-factor-associated amphiphilic repression (EAR) motif; (k) a splice site sequence (e.g., a donor site, a branching site, or an acceptor site; see, for example, the splice sites and splicing signals set forth in the internet site lemur[dot]amu[dot]edu[dot]pl/share/ERISdb/home.html); (l) a recombinase recognition site sequence that is recognized by a site-specific recombinase; (m) a sequence encoding an RNA or amino acid aptamer or an RNA riboswitch, where the specific binding agent is the corresponding ligand, and the change in expression is upregulation or downregulation; (n) a hormone responsive element recognized by a nuclear receptor or a hormone-binding domain thereof; (o) a transcription factor binding sequence; and (p) a polycomb response element (see Xiao et al. (2017) Nature Genetics, 49:1546-1552, doi: 10.1038/ng.3937). Non-limiting examples of target soybean genes that can be subjected to targeted gene edits to confer useful traits include: (a) ZmIPK1 (herbicide tolerant and phytate reduced soybean; Shukla et al., Nature. 2009;459:437-41); (b) ZmGL2 (reduced epicuticular wax in leaves; Char et al. Plant Biotechnol J. 2015;13:1002); (c) ZmMTL (induction of haploid plants; Kelliher et al. Nature. 2017;542:105); (d) Wx1 (high amylopectin content; US 20190032070; incorporated herein by reference in its entirety); (e) TMS5 (thermosensitive male sterile; Li et al. J Genet Genomics. 2017;44:465-8); (f) ALS (herbicide tolerance; Svitashev et al.; Plant Physiol. 2015;169:931-45); and (g) ARGOS8 (drought stress tolerance; Shi et al., Plant Biotechnol J.
2017;15:207-16). Non-limiting examples of target genes in crop plants including soybean which can be subjected to targeted genetic changes which confer useful phenotypes include those set forth in U.S. Pat. Application Publication Nos. 20190352655, 20200199609, 20200157554, and 20200231982, which are each incorporated herein in their entireties; and Zhang et al. (Genome Biol. 2018; 19:210).

Gene editing molecules of use in methods provided herein include molecules capable of introducing a double-strand break (“DSB”) or single-strand break (“SSB”) in double-stranded DNA, such as in genomic DNA or in a target gene located within the genomic DNA, as well as accompanying guide RNA or donor DNA template polynucleotides. Examples of such gene editing molecules include: (a) a nuclease comprising an RNA-guided nuclease, an RNA-guided DNA endonuclease or RNA directed DNA endonuclease (RdDe), a class 1 CRISPR type nuclease system, a type II Cas nuclease, a Cas9, a nCas9 nickase, a type V Cas nuclease, a Cas12a nuclease, a nCas12a nickase, a Cas12d (CasY), a Cas12e (CasX), a Cas12b (C2c1), a Cas12c (C2c3), a Cas12i, a Cas12j, a Cas14, an engineered nuclease, a codon-optimized nuclease, a zinc-finger nuclease (ZFN) or nickase, a transcription activator-like effector nuclease (TAL-effector nuclease or TALEN) or nickase (TALE-nickase), an Argonaute, and a meganuclease or engineered meganuclease; (b) a polynucleotide encoding one or more nucleases capable of effectuating site-specific alteration (including introduction of a DSB or SSB) of a target nucleotide sequence; (c) a guide RNA (gRNA) for an RNA-guided nuclease, or a DNA encoding a gRNA for an RNA-guided nuclease; (d) donor DNA template polynucleotides; and (e) other DNA templates (dsDNA, ssDNA, or combinations thereof) suitable for insertion at a break in genomic DNA (e.g., by non-homologous end joining (NHEJ) or microhomology-mediated end joining (MMEJ)).

CRISPR-type genome editing can be adapted for use in the plant cells and methods provided herein in several ways. CRISPR elements, e.g., gene editing molecules comprising CRISPR endonucleases and CRISPR guide RNAs including single guide RNAs or guide RNAs in combination with tracrRNAs or scoutRNA, or polynucleotides encoding the same, are useful in effectuating genome editing without remnants of the CRISPR elements or selective genetic markers occurring in progeny. In certain embodiments, the CRISPR elements are provided directly to the eukaryotic cell (e.g., plant cells), systems, methods, and compositions as isolated molecules, as isolated or semi-purified products of a cell free synthetic process (e.g., in vitro translation), or as isolated or semi-purified products of a cell-based synthetic process (e.g., such as in a bacterial or other cell lysate). In certain embodiments, genome-inserted CRISPR elements are useful in plant lines adapted for use in the methods provided herein. In certain embodiments, plants or plant cells used in the systems, methods, and compositions provided herein can comprise a transgene that expresses a CRISPR endonuclease (e.g., a Cas9, a Cpf1-type or other CRISPR endonuclease). In certain embodiments, one or more CRISPR endonucleases with unique PAM recognition sites can be used. Guide RNAs (sgRNAs or crRNAs and a tracrRNA) used to form an RNA-guided endonuclease/guide RNA complex can specifically bind via hybridization to gRNA hybridization site sequences (i.e., protospacer sequences) in the gDNA target site that are adjacent to a protospacer adjacent motif (PAM) sequence. The type of RNA-guided endonuclease typically informs the location of suitable PAM sites and design of crRNAs or sgRNAs. G-rich PAM sites, e.g., 5′-NGG, are typically targeted for design of crRNAs or sgRNAs used with Cas9 proteins.
Examples of PAM sequences include 5′-NGG (Streptococcus pyogenes), 5′-NNAGAA (Streptococcus thermophilus CRISPR1), 5′-NGGNG (Streptococcus thermophilus CRISPR3), 5′-NNGRRT or 5′-NNGRR (Staphylococcus aureus Cas9, SaCas9), and 5′-NNNGATT (Neisseria meningitidis). T-rich PAM sites (e.g., 5′-TTN or 5′-TTTV, where “V” is A, C, or G) are typically targeted for design of crRNAs or sgRNAs used with Cas12a proteins (e.g., the Cas12a protein of SEQ ID NO: 15). In some instances, Cas12a can also recognize a 5′-CTA PAM motif. Other examples of potential Cas12a PAM sequences include TTN, CTN, TCN, CCN, TTTN, TCTN, TTCN, CTTN, ATTN, TCCN, TTGN, GTTN, CCCN, CCTN, TTAN, TCGN, CTCN, ACTN, GCTN, TCAN, GCCN, and CCGN (wherein N is defined as any nucleotide). Cpf1 (i.e., Cas12a) endonuclease and corresponding guide RNAs and PAM sites are disclosed in US Patent Application Publication 2016/0208243 A1, which is incorporated herein by reference for its disclosure of DNA encoding Cpf1 endonucleases and guide RNAs and PAM sites. Introduction of one or more of a wide variety of CRISPR guide RNAs that interact with CRISPR endonucleases integrated into a plant genome or otherwise provided to a plant is useful for genetic editing for providing desired phenotypes or traits, for trait screening, or for gene editing mediated trait introgression (e.g., for introducing a trait into a new genotype without backcrossing to a recurrent parent or with limited backcrossing to a recurrent parent). Multiple endonucleases can be provided in expression cassettes with the appropriate promoters to allow multiple genome site editing.
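As an illustration of how the PAM motifs listed above can be located in a DNA sequence, the following Python sketch scans one strand for a degenerate PAM such as 5′-NGG or 5′-TTTV and, for a Cas12a-style PAM, reports the candidate protospacer immediately 3′ of it. The function names, the 21-nucleotide default protospacer length, and the example sequences are illustrative assumptions. Only the strand as written is scanned; a fuller search would also scan the reverse complement.

```python
import re

# IUPAC codes needed for the PAM motifs mentioned in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "V": "[ACG]", "R": "[AG]"}


def find_pam_sites(seq: str, pam: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping)
    match of the degenerate PAM motif on the given strand."""
    # A lookahead lets overlapping matches such as TTTTA be found.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in pam) + "))")
    return [m.start() for m in pattern.finditer(seq.upper())]


def cas12a_protospacers(seq: str, pam: str = "TTTV",
                        length: int = 21) -> list[tuple[int, str]]:
    """For a 5'-PAM nuclease such as Cas12a, return (PAM position,
    protospacer) pairs where the protospacer lies directly 3' of the PAM."""
    seq = seq.upper()
    hits = []
    for pos in find_pam_sites(seq, pam):
        start = pos + len(pam)
        if start + length <= len(seq):
            hits.append((pos, seq[start:start + length]))
    return hits


if __name__ == "__main__":
    demo = "ATTTACGGCATGCATGCATGCATGCATG"  # hypothetical sequence
    print(find_pam_sites(demo, "NGG"))
    print(cas12a_protospacers(demo))
```

For Cas9-type nucleases the PAM lies 3′ of the protospacer, so the slice would instead be taken upstream of each hit.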

CRISPR technology for editing the genes of eukaryotes is disclosed in U.S. Pat. Application Publications 2016/0138008A1 and US2015/0344912A1, and in U.S. Pats. 8,697,359, 8,771,945, 8,945,839, 8,999,641, 8,993,233, 8,895,308, 8,865,406, 8,889,418, 8,871,445, 8,889,356, 8,932,814, 8,795,965, and 8,906,616. Cpf1 endonuclease and corresponding guide RNAs and PAM sites are disclosed in U.S. Pat. Application Publication 2016/0208243 A1. Other CRISPR nucleases useful for editing genomes include Cas12b and Cas12c (see Shmakov et al. (2015) Mol. Cell, 60:385-397; Harrington et al. (2020) Molecular Cell doi:10.1016/j.molcel.2020.06.022) and CasX and CasY (see Burstein et al. (2016) Nature, doi:10.1038/nature21059; Harrington et al. (2020) Molecular Cell doi:10.1016/j.molcel.2020.06.022), or Cas12j (Pausch et al. (2020) Science 10.1126/science.abb1400). Plant RNA promoters for expressing CRISPR guide RNA and plant codon-optimized CRISPR Cas9 endonuclease are disclosed in International Patent Application PCT/US2015/018104 (published as WO 2015/131101 and claiming priority to U.S. Provisional Pat. Application 61/945,700). Methods of using CRISPR technology for genome editing in plants are disclosed in U.S. Pat. Application Publications US 2015/0082478A1 and US 2015/0059010A1 and in International Patent Application PCT/US2015/038767 A1 (published as WO 2016/007347 and claiming priority to U.S. Provisional Pat. Application 62/023,246). All of the patent publications referenced in this paragraph are incorporated herein by reference in their entirety. In certain embodiments, an RNA-guided endonuclease that leaves a blunt end following cleavage of the target site is used. Blunt-end cutting RNA-guided endonucleases include Cas9, Cas12c, and Cas12h (Yan et al., 2019). In certain embodiments, an RNA-guided endonuclease that leaves a staggered single stranded DNA overhanging end following cleavage of the target site is used.
Staggered-end cutting RNA-guided endonucleases include Cas12a, Cas12b, and Cas12e.

The methods can also use sequence-specific endonucleases or sequence-specific endonucleases and guide RNAs that cleave a single DNA strand in a dsDNA target site. Such cleavage of a single DNA strand in a dsDNA target site is also referred to herein and elsewhere as “nicking” and can be effected by various “nickases” or systems that provide for nicking. Nickases that can be used include nCas9 (Cas9 comprising a D10A amino acid substitution), nCas12a (e.g., Cas12a comprising an R1226A amino acid substitution; Yamano et al., 2016), Cas12i (Yan et al. 2019), a zinc finger nickase (e.g., as disclosed in Kim et al., 2012), a TALE nickase (e.g., as disclosed in Wu et al., 2014), or a combination thereof. In certain embodiments, systems that provide for nicking can comprise a Cas nuclease (e.g., Cas9 and/or Cas12a) and guide RNA molecules that have at least one base mismatch to DNA sequences in the target editing site (Fu et al., 2019). In certain embodiments, genome modifications can be introduced into the target editing site by creating single stranded breaks (i.e., “nicks”) in genomic locations separated by no more than about 10, 20, 30, 40, 50, 60, 80, 100, 150, or 200 base pairs of DNA. In certain illustrative and non-limiting embodiments, two nickases (i.e., a Cas nuclease which introduces a single stranded DNA break including nCas9, nCas12a, Cas12i, zinc finger nickases, TALE nickases, combinations thereof, and the like) or nickase systems can be directed to make cuts to nearby sites separated by no more than about 10, 20, 30, 40, 50, 60, 80 or 100 base pairs of DNA. In instances where an RNA guided nickase and an RNA guide are used, the RNA guides are adjacent to PAM sequences that are sufficiently close (i.e., separated by no more than about 10, 20, 30, 40, 50, 60, 80, 100, 150, or 200 base pairs of DNA). For the purposes of gene editing, CRISPR arrays can be designed to contain one or multiple guide RNA sequences corresponding to a desired target DNA sequence; see, for example, Cong et al.
(2013) Science, 339:819-823; Ran et al. (2013) Nature Protocols, 8:2281-2308. At least 16 or 17 nucleotides of gRNA sequence are required by Cas9 for DNA cleavage to occur; for Cpf1 at least 16 nucleotides of gRNA sequence are needed to achieve detectable DNA cleavage and at least 18 nucleotides of gRNA sequence were reported necessary for efficient DNA cleavage in vitro; see Zetsche et al. (2015) Cell, 163:759-771. In practice, guide RNA sequences are generally designed to have a length of 17-24 nucleotides (frequently 19, 20, or 21 nucleotides) and exact complementarity (i.e., perfect base-pairing) to the targeted gene or nucleic acid sequence; guide RNAs having less than 100% complementarity to the target sequence can be used (e.g., a gRNA with a length of 20 nucleotides and 1-4 mismatches to the target sequence) but can increase the potential for off-target effects. The design of effective guide RNAs for use in plant genome editing is disclosed in U.S. Pat. Application Publication 2015/0082478 A1, the entire specification of which is incorporated herein by reference. More recently, efficient gene editing has been achieved using a chimeric “single guide RNA” (“sgRNA”), an engineered (synthetic) single RNA molecule that mimics a naturally occurring crRNA-tracrRNA complex and contains both a tracrRNA (for binding the nuclease) and at least one crRNA (to guide the nuclease to the sequence targeted for editing); see, for example, Cong et al. (2013) Science, 339:819-823; Xing et al. (2014) BMC Plant Biol., 14:327-340. Chemically modified sgRNAs have been demonstrated to be effective in genome editing; see, for example, Hendel et al. (2015) Nature Biotechnol., 985-991.
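The guide-length and mismatch guidance above (17-24 nucleotides, with exact complementarity preferred and 1-4 mismatches tolerated at the cost of more off-target potential) can be expressed as a small screening helper. This is a minimal sketch: the function names, the default thresholds, and the convention of writing the guide spacer in the same sense as the protospacer (with U in place of T) are assumptions made for illustration, and the check does not predict cleavage efficiency.

```python
def count_mismatches(spacer_rna: str, protospacer_dna: str) -> int:
    """Count position-by-position mismatches between a guide spacer
    (RNA, written in the same sense as the protospacer) and the DNA
    protospacer. Both must have the same length."""
    spacer_as_dna = spacer_rna.upper().replace("U", "T")
    target = protospacer_dna.upper()
    if len(spacer_as_dna) != len(target):
        raise ValueError("spacer and protospacer must be the same length")
    return sum(1 for s, t in zip(spacer_as_dna, target) if s != t)


def assess_guide(spacer_rna: str, protospacer_dna: str,
                 min_len: int = 17, max_len: int = 24,
                 max_mismatches: int = 4) -> dict:
    """Summarize whether a guide meets the length range and mismatch
    ceiling described in the text (thresholds are adjustable)."""
    mismatches = count_mismatches(spacer_rna, protospacer_dna)
    length_ok = min_len <= len(spacer_rna) <= max_len
    return {
        "length_ok": length_ok,
        "mismatches": mismatches,
        "exact": mismatches == 0,
        "usable": length_ok and mismatches <= max_mismatches,
    }


if __name__ == "__main__":
    # Hypothetical 20-nt guide and target differing at the last position.
    print(assess_guide("ACGUACGUACGUACGUACGU", "ACGTACGTACGTACGTACGA"))
```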

Genomic DNA may also be modified via base editing. Both adenine base editors (ABE) which convert A/T base pairs to G/C base pairs in genomic DNA as well as cytosine base editors (CBE) which effect C to T substitutions can be used in certain embodiments of the methods provided herein. In certain embodiments, useful ABE and CBE can comprise genome site specific DNA binding elements (e.g., RNA-dependent DNA binding proteins including catalytically inactive Cas9 and Cas12 proteins or Cas9 and Cas12 nickases) operably linked to adenine or cytidine deaminases and used with guide RNAs which position the protein near the nucleotide targeted for substitution. Suitable ABE and CBE disclosed in the literature (Kim, Nat Plants, 2018 Mar;4(3):148-151) can be adapted for use in the methods set forth herein. In certain embodiments, a CBE can comprise a fusion of a catalytically inactive Cas9 (dCas9) RNA dependent DNA binding protein to a cytidine deaminase which converts cytosine (C) to uridine (U) and selected guide RNAs, thereby effecting a C to T substitution; see Komor et al. (2016) Nature, 533:420-424. In other embodiments, C to T substitutions are effected with Cas9 nickase [Cas9n(D10A)] fused to an improved cytidine deaminase and optionally a bacteriophage Mu dsDNA (double-stranded DNA) end-binding protein Gam; see Komor et al., Sci Adv. 2017 Aug; 3(8):eaao4774. In other embodiments, adenine base editors (ABEs) comprising an adenine deaminase fused to catalytically inactive Cas9 (dCas9) or a Cas9 D10A nickase can be used to convert A/T base pairs to G/C base pairs in genomic DNA (Gaudelli et al. (2017) Nature 551(7681):464-471).
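The net sequence outcome of the base editors described above (C to T for a CBE, A to G for an ABE) can be sketched in Python as follows. The editing window of protospacer positions 4 through 8 is an assumption supplied as an adjustable parameter and is not stated in this specification; the function name and example sequence are likewise hypothetical. The sketch converts every eligible base in the window on the written strand, whereas real editors typically convert only a subset.

```python
# Net base conversions on the edited strand for each editor class.
CONVERSIONS = {"CBE": ("C", "T"), "ABE": ("A", "G")}


def base_edit(protospacer: str, editor: str,
              window: tuple[int, int] = (4, 8)) -> str:
    """Return the protospacer after converting every target base inside
    the 1-based inclusive editing window (idealized outcome)."""
    if editor not in CONVERSIONS:
        raise ValueError("editor must be 'CBE' or 'ABE'")
    source, product = CONVERSIONS[editor]
    low, high = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if low <= position <= high and base == source:
            edited.append(product)
        else:
            edited.append(base)
    return "".join(edited)


if __name__ == "__main__":
    site = "AAACCCAAACCCAAACCCAA"  # hypothetical 20-nt protospacer
    print(base_edit(site, "CBE"))
    print(base_edit(site, "ABE"))
```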

In certain embodiments, zinc finger nucleases or zinc finger nickases can also be used in the methods provided herein. Zinc-finger nucleases are site-specific endonucleases comprising two protein domains: a DNA-binding domain, comprising a plurality of individual zinc finger repeats that together recognize between 9 and 18 base pairs, and a DNA-cleavage domain that comprises a nuclease domain (typically FokI). The cleavage domain dimerizes in order to cleave DNA; therefore, a pair of ZFNs is required to target non-palindromic target polynucleotides. In certain embodiments, zinc finger nuclease and zinc finger nickase design methods which have been described (Urnov et al. (2010) Nature Rev. Genet., 11:636 - 646; Mohanta et al. (2017) Genes vol. 8, 12: 399; Ramirez et al. Nucleic Acids Res. (2012); 40(12): 5560-5568; Liu et al. (2013) Nature Communications, 4: 2565) can be adapted for use in the methods set forth herein. The zinc finger binding domains of the zinc finger nuclease or nickase provide specificity and can be engineered to specifically recognize any desired target DNA sequence. The zinc finger DNA binding domains are derived from the DNA-binding domain of a large class of eukaryotic transcription factors called zinc finger proteins (ZFPs). The DNA-binding domain of ZFPs typically contains a tandem array of at least three zinc “fingers” each recognizing a specific triplet of DNA. A number of strategies can be used to design the binding specificity of the zinc finger binding domain. One approach, termed “modular assembly”, relies on the functional autonomy of individual zinc fingers with DNA. In this approach, a given sequence is targeted by identifying zinc fingers for each component triplet in the sequence and linking them into a multifinger peptide. 
Several alternative strategies for designing zinc finger DNA binding domains have also been developed. These methods are designed to accommodate the ability of zinc fingers to contact neighboring fingers as well as nucleotide bases outside their target triplet. Typically, the engineered zinc finger DNA binding domain has a novel binding specificity, compared to a naturally-occurring zinc finger protein. Engineering methods include, for example, rational design and various types of selection. Rational design includes, for example, the use of databases of triplet (or quadruplet) nucleotide sequences and individual zinc finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of zinc fingers which bind the particular triplet or quadruplet sequence. See, e.g., U.S. Pats. 6,453,242 and 6,534,261, both incorporated herein by reference in their entirety. Exemplary selection methods (e.g., phage display and yeast two-hybrid systems) can be adapted for use in the methods described herein. In addition, enhancement of binding specificity for zinc finger binding domains has been described in U.S. Pat. 6,794,136, incorporated herein by reference in its entirety. In addition, individual zinc finger domains may be linked together using any suitable linker sequences. Examples of linker sequences are publicly known, e.g., see U.S. Pats. 6,479,626; 6,903,185; and 7,153,949, incorporated herein by reference in their entirety. The nucleic acid cleavage domain is non-specific and is typically a restriction endonuclease, such as FokI. This endonuclease must dimerize to cleave DNA. Thus, cleavage by FokI as part of a ZFN requires two adjacent and independent binding events, which must occur both in the correct orientation and with appropriate spacing to permit dimer formation. 
The requirement for two DNA binding events enables more specific targeting of long and potentially unique recognition sites. FokI variants with enhanced activities have been described and can be adapted for use in the methods described herein; see, e.g., Guo et al. (2010) J. Mol. Biol., 400:96 - 107.

Transcription activator like effectors (TALEs) are proteins secreted by certain Xanthomonas species to modulate gene expression in host plants and to facilitate the colonization by and survival of the bacterium. TALEs act as transcription factors and modulate expression of resistance genes in the plants. Recent studies of TALEs have revealed the code linking the repetitive region of TALEs with their target DNA-binding sites. TALEs comprise a highly conserved and repetitive region consisting of tandem repeats of mostly 33 or 34 amino acid segments. The repeat monomers differ from each other mainly at amino acid positions 12 and 13. A strong correlation between unique pairs of amino acids at positions 12 and 13 and the corresponding nucleotide in the TALE-binding site has been found. The simple relationship between amino acid sequence and DNA recognition of the TALE binding domain allows for the design of DNA binding domains of any desired specificity. TALEs can be linked to a non-specific DNA cleavage domain to prepare genome editing proteins, referred to as TAL-effector nucleases or TALENs. As in the case of ZFNs, a restriction endonuclease, such as FokI, can be conveniently used. Methods for use of TALENs in plants have been described and can be adapted for use in the methods described herein; see Mahfouz et al. (2011) Proc. Natl. Acad. Sci. USA, 108:2623 - 2628; Mahfouz (2011) GM Crops, 2:99 - 103; and Mohanta et al. (2017) Genes vol. 8, 12: 399. TALE nickases have also been described and can be adapted for use in methods described herein (Wu et al., Biochem Biophys Res Commun. (2014); 446(1):261-6; Luo et al., Scientific Reports 6, Article number: 20657 (2016)).
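
The relationship between the amino acid pair at positions 12 and 13 of each repeat (commonly called the repeat-variable diresidue, or RVD) and the recognized nucleotide can be illustrated with a short Python sketch. The mapping below uses the four widely reported RVD associations and simplifies NN to G (NN is also reported to recognize A); the function names and example sequence are assumptions for illustration only.

```python
# Commonly reported RVD-to-nucleotide associations (simplified: NN -> G only).
RVD_TO_BASE = {"NI": "A", "HD": "C", "NG": "T", "NN": "G"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def rvds_to_target(rvds: list) -> str:
    """Predict the DNA sequence recognized by an ordered list of RVDs."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


def target_to_rvds(sequence: str) -> list:
    """Propose one RVD per nucleotide for a desired DNA target sequence."""
    return [BASE_TO_RVD[base] for base in sequence.upper()]


# Hypothetical target sequence (not taken from the disclosure).
designed = target_to_rvds("CTAGGA")
print(designed)
print(rvds_to_target(designed))
```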

Embodiments of the donor DNA template molecule having a sequence that is integrated at the site of at least one double-strand break (DSB) in a genome include a double-stranded DNA, a single-stranded DNA, a single-stranded DNA/RNA hybrid, and a double-stranded DNA/RNA hybrid. In embodiments, a donor DNA template molecule that is a double-stranded (e.g., a dsDNA or dsDNA/RNA hybrid) molecule is provided directly to the plant protoplast or plant cell in the form of a double-stranded DNA or a double-stranded DNA/RNA hybrid, or as two single-stranded DNA (ssDNA) molecules that are capable of hybridizing to form dsDNA, or as a single-stranded DNA molecule and a single-stranded RNA (ssRNA) molecule that are capable of hybridizing to form a double-stranded DNA/RNA hybrid; that is to say, the double-stranded polynucleotide molecule is not provided indirectly, for example, by expression in the cell of a dsDNA encoded by a plasmid or other vector. In various non-limiting embodiments of the method, the donor DNA template molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome is double-stranded and blunt-ended; in other embodiments the donor DNA template molecule is double-stranded and has an overhang or “sticky end” consisting of unpaired nucleotides (e.g., 1, 2, 3, 4, 5, or 6 unpaired nucleotides) at one terminus or both termini. 
In an embodiment, the DSB in the genome has no unpaired nucleotides at the cleavage site, and the donor DNA template molecule that is integrated (or that has a sequence that is integrated) at the site of the DSB is a blunt-ended double-stranded DNA or blunt-ended double-stranded DNA/RNA hybrid molecule, or alternatively is a single-stranded DNA or a single-stranded DNA/RNA hybrid molecule. In another embodiment, the DSB in the genome has one or more unpaired nucleotides at one or both sides of the cleavage site, and the donor DNA template molecule that is integrated (or that has a sequence that is integrated) at the site of the DSB is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule with an overhang or “sticky end” consisting of unpaired nucleotides at one or both termini, or alternatively is a single-stranded DNA or a single-stranded DNA/RNA hybrid molecule; in embodiments, the donor DNA template molecule is a double-stranded DNA or double-stranded DNA/RNA hybrid molecule that includes an overhang at one or at both termini, wherein the overhang consists of the same number of unpaired nucleotides as the number of unpaired nucleotides created at the site of a DSB by a nuclease that cuts in an off-set fashion (e.g., where a Cas12 nuclease effects an off-set DSB with 5-nucleotide overhangs in the genomic sequence, the donor DNA template molecule that is to be integrated (or that has a sequence that is to be integrated) at the site of the DSB is double-stranded and has 5 unpaired nucleotides at one or both termini). In certain embodiments, one or both termini of the donor DNA template molecule contain no regions of sequence homology (identity or complementarity) to genomic regions flanking the DSB; that is to say, one or both termini of the donor DNA template molecule contain no regions of sequence that is sufficiently complementary to permit hybridization to genomic regions immediately adjacent to the location of the DSB. 
In embodiments, the donor DNA template molecule contains no homology to the locus of the DSB, that is to say, the donor DNA template molecule contains no nucleotide sequence that is sufficiently complementary to permit hybridization to genomic regions immediately adjacent to the location of the DSB. In embodiments, the donor DNA template molecule is at least partially double-stranded and includes 2-20 base-pairs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 base-pairs; in embodiments, the donor DNA template molecule is double-stranded and blunt-ended and consists of 2-20 base-pairs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 base-pairs; in other embodiments, the donor DNA template molecule is double-stranded and includes 2-20 base-pairs, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 base-pairs and in addition has at least one overhang or “sticky end” consisting of at least one additional, unpaired nucleotide at one or at both termini. In an embodiment, the donor DNA template molecule that is integrated (or that has a sequence that is integrated) at the site of at least one double-strand break (DSB) in a genome is a blunt-ended double-stranded DNA or a blunt-ended double-stranded DNA/RNA hybrid molecule of about 18 to about 300 base-pairs, or about 20 to about 200 base-pairs, or about 30 to about 100 base-pairs, and having at least one phosphorothioate bond between adjacent nucleotides at a 5′ end, 3′ end, or both 5′ and 3′ ends. In embodiments, the donor DNA template molecule includes single strands of at least 11, at least 18, at least 20, at least 30, at least 40, at least 60, at least 80, at least 100, at least 120, at least 140, at least 160, at least 180, at least 200, at least 240, at least 280, or at least 320 nucleotides. 
In embodiments, the donor DNA template molecule has a length of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, or at least 11 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 2 to about 320 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 2 to about 500 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 5 to about 500 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 5 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 11 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or about 18 to about 300 base-pairs if double-stranded (or nucleotides if single-stranded), or between about 30 to about 100 base-pairs if double-stranded (or nucleotides if single-stranded). In embodiments, the donor DNA template molecule includes chemically modified nucleotides (see, e.g., the various modifications of internucleotide linkages, bases, and sugars described in Verma and Eckstein (1998) Annu. Rev. Biochem., 67:99-134); in embodiments, the naturally occurring phosphodiester backbone of the donor DNA template molecule is partially or completely modified with phosphorothioate, phosphorodithioate, or methylphosphonate internucleotide linkage modifications, or the donor DNA template molecule includes modified nucleoside bases or modified sugars, or the donor DNA template molecule is labelled with a fluorescent moiety (e.g., fluorescein or rhodamine or a fluorescent nucleoside analogue) or other detectable label (e.g., biotin or an isotope). In another embodiment, the donor DNA template molecule contains secondary structure that provides stability or acts as an aptamer. 
Other related embodiments include double-stranded DNA/RNA hybrid molecules, single-stranded DNA/RNA hybrid donor molecules, and single-stranded donor DNA template molecules (including single-stranded, chemically modified donor DNA template molecules), which in analogous procedures are integrated (or have a sequence that is integrated) at the site of a double-strand break. Donor DNA templates provided herein include those comprising CgRRS sequences flanked by DNA with homology to a donor polynucleotide and include the donor DNA template set forth in SEQ ID NO: 11 and equivalents thereof with longer or shorter homology arms. In certain embodiments, a donor DNA template can comprise an adapter molecule (e.g., a donor DNA template formed by annealing single stranded DNAs which do not overlap at their 5′ and 3′ terminal ends) with cohesive ends which can anneal to an overhanging cleavage site (e.g., introduced by a Cas12a nuclease and suitable gRNAs). In certain embodiments, integration of the donor DNA templates can be facilitated by use of a bacteriophage lambda exonuclease, a bacteriophage lambda beta SSAP protein, and an E. coli SSB essentially as set forth in U.S. Pat. Application Publication 20200407754, which is incorporated herein by reference in its entirety.

Donor DNA template molecules used in the methods provided herein include DNA molecules comprising, from 5′ to 3′, a first homology arm, a replacement DNA, and a second homology arm, wherein the homology arms contain sequences that are partially or completely homologous to genomic DNA (gDNA) sequences flanking a target site-specific endonuclease cleavage site in the gDNA. In certain embodiments, the replacement DNA can comprise an insertion, deletion, or substitution of 1 or more DNA base pairs relative to the target gDNA. In an embodiment, the donor DNA template molecule is double-stranded and perfectly base-paired through all or most of its length, with the possible exception of any unpaired nucleotides at either terminus or both termini. In another embodiment, the donor DNA template molecule is double-stranded and includes one or more non-terminal mismatches or non-terminal unpaired nucleotides within the otherwise double-stranded duplex. In an embodiment, the donor DNA template molecule that is integrated at the site of at least one double-strand break (DSB) includes between 2-20 nucleotides in one (if single-stranded) or in both strands (if double-stranded), e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 nucleotides on one or on both strands, each of which can be base-paired to a nucleotide on the opposite strand (in the case of a perfectly base-paired double-stranded polynucleotide molecule). Such donor DNA templates can be integrated in genomic DNA containing blunt and/or staggered double stranded DNA breaks by homology-directed repair (HDR). In certain embodiments, a donor DNA template homology arm can be about 20, 50, 100, 200, 400, or 600 to about 800, or 1000 base pairs in length. In certain embodiments, a donor DNA template molecule can be delivered to a plant cell in a circular (e.g., a plasmid or a viral vector including a geminivirus vector) or a linear DNA molecule. 
In certain embodiments, a circular or linear DNA molecule that is used can comprise a modified donor DNA template molecule comprising, from 5′ to 3′, a first copy of the target sequence-specific endonuclease cleavage site sequence, the first homology arm, the replacement DNA, the second homology arm, and a second copy of the target sequence-specific endonuclease cleavage site sequence. Without seeking to be limited by theory, such modified donor DNA template molecules can be cleaved by the same sequence-specific endonuclease that is used to cleave the target site gDNA of the eukaryotic cell to release a donor DNA template molecule that can participate in HDR-mediated genome modification of the target editing site in the plant cell genome. In certain embodiments, the donor DNA template can comprise a linear DNA molecule comprising, from 5′ to 3′, a cleaved target sequence-specific endonuclease cleavage site sequence, the first homology arm, the replacement DNA, the second homology arm, and a cleaved target sequence-specific endonuclease cleavage site sequence. In certain embodiments, the cleaved target sequence-specific endonuclease sequence can comprise a blunt DNA end or a blunt DNA end that can optionally comprise a 5′ phosphate group. In certain embodiments, the cleaved target sequence-specific endonuclease sequence comprises a DNA end having a single-stranded 5′ or 3′ DNA overhang. Such cleaved target sequence-specific endonuclease cleavage site sequences can be produced by either cleaving an intact target sequence-specific endonuclease cleavage site sequence or by synthesizing a copy of the cleaved target sequence-specific endonuclease cleavage site sequence. Donor DNA templates can be synthesized either chemically or enzymatically (e.g., in a polymerase chain reaction (PCR)). Donor DNA templates provided herein include those comprising CgRRS sequences flanked by DNA with homology to a donor polynucleotide. 
An example of a useful DNA donor template provided herein is a DNA molecule comprising SEQ ID NO: 11.
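
The 5′ to 3′ arrangement of donor DNA template elements described above can be sketched in Python as a simple string assembly. The function name and the short placeholder sequences are hypothetical and chosen only for illustration; they do not correspond to SEQ ID NO: 11 or to any sequence of the DAS44406-6 locus, and real homology arms would be far longer (e.g., about 20 to about 1000 base pairs).

```python
def assemble_donor(left_arm: str, replacement: str, right_arm: str,
                   cleavage_site: str = "") -> str:
    """Join donor template elements in 5'->3' order.

    If a cleavage_site is given, one copy is placed at each end so that the
    same endonuclease could release the donor from a longer molecule.
    """
    parts = [cleavage_site, left_arm, replacement, right_arm, cleavage_site]
    donor = "".join(part.upper() for part in parts)
    invalid = set(donor) - set("ACGT")
    if invalid:
        raise ValueError(f"non-DNA characters found: {sorted(invalid)}")
    return donor


# Hypothetical placeholder sequences for illustration only.
left = "AAAACCCC"
insert = "GATTACA"
right = "GGGGTTTT"
site = "TTTCACGT"

plain_donor = assemble_donor(left, insert, right)
flanked_donor = assemble_donor(left, insert, right, cleavage_site=site)
print(plain_donor)
print(flanked_donor)
```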

Various treatments are useful in delivery of gene editing molecules and/or other molecules to a DAS44406-6 or INHT26 plant cell. In certain embodiments, one or more treatments is employed to deliver the gene editing or other molecules (e.g., comprising a polynucleotide, polypeptide or combination thereof) into a eukaryotic or plant cell, e.g., through barriers such as a cell wall, a plasma membrane, a nuclear envelope, and/or other lipid bilayer. In certain embodiments, a polynucleotide-, polypeptide-, or RNP-containing composition comprising the molecules is delivered directly, for example by direct contact of the composition with a plant cell. The aforementioned compositions can be provided in the form of a liquid, a solution, a suspension, an emulsion, a reverse emulsion, a colloid, a dispersion, a gel, liposomes, micelles, an injectable material, an aerosol, a solid, a powder, a particulate, a nanoparticle, or a combination thereof, and can be applied directly to a plant, plant part, plant cell, or plant explant (e.g., through abrasion or puncture or other disruption of the cell wall or cell membrane, by spraying or dipping or soaking or otherwise directly contacting, by microinjection). For example, a plant cell or plant protoplast is soaked in a liquid genome editing molecule-containing composition, whereby the agent is delivered to the plant cell. In certain embodiments, the agent-containing composition is delivered using negative or positive pressure, for example, using vacuum infiltration or application of hydrodynamic or fluid pressure. 
In certain embodiments, the agent-containing composition is introduced into a plant cell or plant protoplast, e.g., by microinjection or by disruption or deformation of the cell wall or cell membrane, for example by physical treatments such as by application of negative or positive pressure, shear forces, or treatment with a chemical or physical delivery agent such as surfactants, liposomes, or nanoparticles; see, e.g., delivery of materials to cells employing microfluidic flow through a cell-deforming constriction as described in U.S. Published Pat. Application 2014/0287509, incorporated by reference in its entirety herein. Other techniques useful for delivering the agent-containing composition to a eukaryotic cell, plant cell or plant protoplast include: ultrasound or sonication; vibration, friction, shear stress, vortexing, cavitation; centrifugation or application of mechanical force; mechanical cell wall or cell membrane deformation or breakage; enzymatic cell wall or cell membrane breakage or permeabilization; abrasion or mechanical scarification (e.g., abrasion with carborundum or other particulate abrasive or scarification with a file or sandpaper) or chemical scarification (e.g., treatment with an acid or caustic agent); and electroporation. In certain embodiments, the agent-containing composition is provided by bacterially mediated (e.g., Agrobacterium sp., Rhizobium sp., Sinorhizobium sp., Mesorhizobium sp., Bradyrhizobium sp., Azobacter sp., Phyllobacterium sp.) transfection of the plant cell or plant protoplast with a polynucleotide encoding the genome editing molecules (e.g., RNA dependent DNA endonuclease, RNA dependent DNA binding protein, RNA dependent nickase, ABE, or CBE, and/or guide RNA); see, e.g., Broothaerts et al. (2005) Nature, 433:629 - 633. 
Any of these techniques or a combination thereof are alternatively employed on the plant explant, plant part or tissue or intact plant (or seed) from which a plant cell is optionally subsequently obtained or isolated; in certain embodiments, the agent-containing composition is delivered in a separate step after the plant cell has been isolated.

In some embodiments, one or more polynucleotides or vectors driving expression of one or more genome editing molecules or trait-conferring genes (e.g., herbicide tolerance, insect resistance, and/or male sterility) are introduced into a DAS44406-6 or INHT26 plant cell. In certain embodiments, a polynucleotide vector comprises a regulatory element such as a promoter operably linked to one or more polynucleotides encoding genome editing molecules and/or trait-conferring genes. In such embodiments, expression of these polynucleotides can be controlled by selection of the appropriate promoter, particularly promoters functional in a eukaryotic cell (e.g., plant cell); useful promoters include constitutive, conditional, inducible, and temporally or spatially specific promoters (e.g., a tissue specific promoter, a developmentally regulated promoter, or a cell cycle regulated promoter). Developmentally regulated promoters that can be used in plant cells include Phospholipid Transfer Protein (PLTP), fructose-1,6-bisphosphatase protein, NAD(P)-binding Rossmann-Fold protein, adipocyte plasma membrane-associated protein-like protein, Rieske [2Fe-2S] iron-sulfur domain protein, chlororespiratory reduction 6 protein, D-glycerate 3-kinase, chloroplastic-like protein, chlorophyll a-b binding protein 7, chloroplastic-like protein, ultraviolet-B-repressible protein, Soul heme-binding family protein, Photosystem I reaction center subunit psi-N protein, and short-chain dehydrogenase/reductase protein that are disclosed in U.S. Pat. Application Publication No. 20170121722, which is incorporated herein by reference in its entirety and specifically with respect to such disclosure. 
In certain embodiments, the promoter is operably linked to nucleotide sequences encoding multiple guide RNAs, wherein the sequences encoding guide RNAs are separated by a cleavage site such as a nucleotide sequence encoding a microRNA recognition/cleavage site or a self-cleaving ribozyme (see, e.g., Ferre-D′Amare and Scott (2014) Cold Spring Harbor Perspectives Biol., 2:a003574). In certain embodiments, the promoter is an RNA polymerase III promoter operably linked to a nucleotide sequence encoding one or more guide RNAs. In certain embodiments, the RNA polymerase III promoter is a plant U6 spliceosomal RNA promoter, which can be native to the genome of the plant cell or from a different species, e.g., a U6 promoter from soybean or tomato such as those disclosed in U.S. Pat. Application Publication 2017/0166912, or a homologue thereof; in an example, such a promoter is operably linked to DNA sequence encoding a first RNA molecule including a Cas12a gRNA followed by an operably linked and suitable 3′ element such as a U6 poly-T terminator. In another embodiment, the RNA polymerase III promoter is a plant U3, 7SL (signal recognition particle RNA), U2, or U5 promoter, or chimerics thereof, e.g., as described in U.S. Pat. Application Publication 20170166912. In certain embodiments, the promoter operably linked to one or more polynucleotides is a constitutive promoter that drives gene expression in eukaryotic cells (e.g., plant cells). In certain embodiments, the promoter drives gene expression in the nucleus or in an organelle such as a chloroplast or mitochondrion. Examples of constitutive promoters for use in plants include a CaMV 35S promoter as disclosed in U.S. Pats. 5,858,742 and 5,322,938, a rice actin promoter as disclosed in U.S. Pat. 5,641,876, a soybean chloroplast aldolase promoter as disclosed in U.S. Pat. 7,151,204, and the nopaline synthase (NOS) and octopine synthase (OCS) promoters from Agrobacterium tumefaciens. 
In certain embodiments, the promoter operably linked to one or more polynucleotides encoding elements of a genome-editing system is a promoter from figwort mosaic virus (FMV), a RUBISCO promoter, or a pyruvate phosphate dikinase (PPDK) promoter, which is active in photosynthetic tissues. Other contemplated promoters include cell-specific or tissue-specific or developmentally regulated promoters, for example, a promoter that limits the expression of the nucleic acid targeting system to germline or reproductive cells (e.g., promoters of genes encoding DNA ligases, recombinases, replicases, or other genes specifically expressed in germline or reproductive cells). In certain embodiments, the genome alteration is limited only to those cells from which DNA is inherited in subsequent generations, which is advantageous where it is desirable that expression of the genome-editing system be limited in order to avoid genotoxicity or other unwanted effects. All of the patent publications referenced in this paragraph are incorporated herein by reference in their entirety.

Expression vectors or polynucleotides provided herein may contain a DNA segment near the 3′ end of an expression cassette that acts as a signal to terminate transcription and directs polyadenylation of the resultant mRNA and may also support promoter activity. Such a 3′ element is commonly referred to as a “3′-untranslated region” or “3′-UTR” or a “polyadenylation signal.” In some cases, plant gene-based 3′ elements (or terminators) consist of both the 3′-UTR and downstream non-transcribed sequence (Nuccio et al., 2015). Useful 3′ elements include: Agrobacterium tumefaciens nos 3′, tml 3′, tmr 3′, tms 3′, ocs 3′, and tr7 3′ elements disclosed in U.S. Pat. No. 6,090,627, incorporated herein by reference, and 3′ elements from plant genes such as the heat shock protein 17, ubiquitin, and fructose-1,6-bisphosphatase genes from wheat (Triticum aestivum), and the glutelin, lactate dehydrogenase, and beta-tubulin genes from rice (Oryza sativa), disclosed in U.S. Pat. Application Publication 2002/0192813 A1. All of the patent publications referenced in this paragraph are incorporated herein by reference in their entireties.

In certain embodiments, the DAS44406-6 or INHT26 plant cells used herein can comprise haploid, diploid, or polyploid plant cells or plant protoplasts, for example, those obtained from a haploid, diploid, or polyploid plant, plant part or tissue, or callus. In certain embodiments, plant cells in culture (or the regenerated plant, progeny seed, and progeny plant) are haploid or can be induced to become haploid; techniques for making and using haploid plants and plant cells are known in the art, see, e.g., methods for generating haploids in Arabidopsis thaliana by crossing of a wild-type strain to a haploid-inducing strain that expresses altered forms of the centromere-specific histone CENH3, as described by Maruthachalam and Chan in “How to make haploid Arabidopsis thaliana”, protocol available at www[dot]openwetware[dot]org/images/d/d3/Haploid_Arabidopsis_protocol[dot]pdf; (Ravi et al. (2014) Nature Communications, 5:5334, doi: 10.1038/ncomms6334). Haploids can also be obtained in a wide variety of plants (e.g., soybean, wheat, rice, sorghum, barley) by crossing a plant comprising a mutated CENH3 gene with a wildtype diploid plant to generate haploid progeny as disclosed in U.S. Pat. No. 9,215,849, which is incorporated herein by reference in its entirety. Haploid-inducing soybean lines that can be used to obtain haploid soybean plants and/or cells include Stock 6, MHI (Moldovian Haploid Inducer), indeterminate gametophyte (ig) mutation, KEMS, RWK, ZEM, ZMS, KMS, as well as transgenic haploid inducer lines disclosed in U.S. Pat. No. 9,677,082, which is incorporated herein by reference in its entirety. Examples of haploid cells include but are not limited to plant cells obtained from haploid plants and plant cells obtained from reproductive tissues, e.g., from flowers, developing flowers or flower buds, ovaries, ovules, megaspores, anthers, pollen, megagametophyte, and microspores. 
In certain embodiments where the plant cell or plant protoplast is haploid, the genetic complement can be doubled by chromosome doubling (e.g., by spontaneous chromosomal doubling by meiotic non-reduction, or by using a chromosome doubling agent such as colchicine, oryzalin, trifluralin, pronamide, nitrous oxide gas, anti-microtubule herbicides, anti-microtubule agents, and mitotic inhibitors) in the plant cell or plant protoplast to produce a doubled haploid plant cell or plant protoplast wherein the complement of genes or alleles is homozygous; yet other embodiments include regeneration of a doubled haploid plant from the doubled haploid plant cell or plant protoplast. Another embodiment is related to a hybrid plant having at least one parent plant that is a doubled haploid plant provided by this approach. Production of doubled haploid plants provides homozygosity in one generation, instead of requiring several generations of self-crossing to obtain homozygous plants. The use of doubled haploids is advantageous in any situation where there is a desire to establish genetic purity (i.e., homozygosity) in the least possible time. Doubled haploid production can be particularly advantageous in slow-growing plants or for producing hybrid plants that are offspring of at least one doubled-haploid plant.
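
The advantage of doubled haploids over repeated self-crossing can be illustrated with a short calculation. Under the standard Mendelian expectation (stated here as an assumption of the sketch, not a disclosed result), a locus that is heterozygous in the starting plant remains heterozygous in half of the progeny after each generation of selfing, so the expected heterozygous fraction after n generations is (1/2)^n, whereas a doubled haploid is expected to be homozygous at every locus after a single step. The function name is hypothetical.

```python
def heterozygous_fraction_after_selfing(generations: int) -> float:
    """Expected fraction of progeny still heterozygous at a single locus
    after the given number of selfing generations (Mendelian expectation)."""
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 0.5 ** generations


# Compare selfing with the doubled haploid route, which is expected to give
# 0% heterozygosity after one step.
for n in range(1, 7):
    fraction = heterozygous_fraction_after_selfing(n)
    print(f"selfing generation {n}: {fraction:.4%} expected heterozygous")
```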

In certain embodiments, the DAS44406-6 or INHT26 plant cells used in the methods provided herein can include non-dividing cells. Such non-dividing cells can include plant cell protoplasts, plant cells subjected to one or more of a genetically and/or pharmaceutically induced cell-cycle blockage, and the like.

In certain embodiments, the DAS44406-6 or INHT26 plant cells used in the methods provided herein can include dividing cells. Dividing cells can include those cells found in various plant tissues including leaves, meristems, and embryos. These tissues include dividing cells from young soybean leaf, meristems and scutellar tissue from about 8 or 10 to about 12 or 14 days after pollination (DAP) embryos. The isolation of soybean embryos has been described in several publications (Brettschneider, Becker, and Lörz 1997; Leduc et al. 1996; Frame et al. 2011; K. Wang and Frame 2009). In certain embodiments, basal leaf tissues (e.g., leaf tissues located about 0 to 3 cm from the ligule of a soybean plant; Kirienko, Luo, and Sylvester 2012) are targeted for HDR-mediated gene editing. Methods for obtaining regenerable plant structures and regenerating plants from the NHEJ-, MMEJ-, or HDR-mediated gene editing of plant cells provided herein can be adapted from methods disclosed in U.S. Pat. Application Publication No. 20170121722, which is incorporated herein by reference in its entirety and specifically with respect to such disclosure. In certain embodiments, single plant cells subjected to the HDR-mediated gene editing will give rise to single regenerable plant structures. In certain embodiments, the single regenerable plant cell structure can form from a single cell on, or within, an explant that has been subjected to the NHEJ-, MMEJ-, or HDR-mediated gene editing.

In some embodiments, methods provided herein can include the additional step of growing or regenerating an INHT26 plant from an INHT26 plant cell that had been subjected to the gene editing or from a regenerable plant structure obtained from that INHT26 plant cell. In certain embodiments, the plant can further comprise an inserted transgene, a target gene edit, or genome edit as provided by the methods and compositions disclosed herein. In certain embodiments, callus is produced from the plant cell, and plantlets and plants are produced from such callus. In other embodiments, whole seedlings or plants are grown directly from the plant cell without a callus stage. Thus, additional related aspects are directed to whole seedlings and plants grown or regenerated from the plant cell or plant protoplast having a target gene edit or genome edit, as well as the seeds of such plants. In certain embodiments wherein the plant cell or plant protoplast is subjected to genetic modification (for example, genome editing by means of, e.g., an RdDe), the grown or regenerated plant exhibits a phenotype associated with the genetic modification. In certain embodiments, the grown or regenerated plant includes in its genome two or more genetic or epigenetic modifications that in combination provide at least one phenotype of interest.

In certain embodiments, a heterogeneous population of plant cells having a target gene edit or genome edit, at least some of which include at least one genetic or epigenetic modification, is provided by the method; related aspects include a plant having a phenotype of interest associated with the genetic or epigenetic modification, provided by either regeneration of a plant having the phenotype of interest from a plant cell or plant protoplast selected from the heterogeneous population of plant cells having a target gene or genome edit, or by selection of a plant having the phenotype of interest from a heterogeneous population of plants grown or regenerated from the population of plant cells having a targeted genetic edit or genome edit. Examples of phenotypes of interest include herbicide resistance, improved tolerance of abiotic stress (e.g., tolerance of temperature extremes, drought, or salt) or biotic stress (e.g., resistance to nematode, bacterial, or fungal pathogens), improved utilization of nutrients or water, modified lipid, carbohydrate, or protein composition, improved flavor or appearance, improved storage characteristics (e.g., resistance to bruising, browning, or softening), increased yield, and altered morphology (e.g., floral architecture or color, plant height, branching, root structure). In an embodiment, a heterogeneous population of plant cells having a target gene edit or genome edit (or seedlings or plants grown or regenerated therefrom) is exposed to conditions permitting expression of the phenotype of interest; e.g., selection for herbicide resistance can include exposing the population of plant cells having a target gene edit or genome edit (or seedlings or plants grown or regenerated therefrom) to an amount of herbicide or other substance that inhibits growth or is toxic, allowing identification and selection of those resistant plant cells (or seedlings or plants) that survive treatment.

Methods for obtaining regenerable plant structures and regenerating plants from plant cells or regenerable plant structures can be adapted from published procedures (Roest and Gilissen, Acta Bot. Neerl., 1989, 38(1), 1-23; Bhaskaran and Smith, Crop Sci. 30(6): 1328-1337; Ikeuchi et al., Development, 2016, 143: 1442-1451). Methods for obtaining regenerable plant structures and regenerating plants from plant cells or regenerable plant structures can also be adapted from U.S. Pat. Application Publication No. 20170121722, which is incorporated herein by reference in its entirety and specifically with respect to such disclosure. Also provided are heterogeneous or homogeneous populations of such plants or parts thereof (e.g., seeds), succeeding generations or seeds of such plants grown or regenerated from the plant cells or plant protoplasts, having a target gene edit or genome edit. Additional related aspects include a hybrid plant provided by crossing a first plant grown or regenerated from a plant cell or plant protoplast having a target gene edit or genome edit and having at least one genetic or epigenetic modification, with a second plant, wherein the hybrid plant contains the genetic or epigenetic modification; also contemplated is seed produced by the hybrid plant. Also envisioned as related aspects are progeny seed and progeny plants, including hybrid seed and hybrid plants, having the regenerated plant as a parent or ancestor. The plant cells and derivative plants and seeds disclosed herein can be used for various purposes useful to the consumer or grower.

In other embodiments, processed products are made from the INHT26 plant or its seeds, including: (a) soybean seed meal (defatted or non-defatted); (b) extracted proteins, oils, sugars, and starches; (c) fermentation products; (d) animal feed or human food products (e.g., feed and food comprising soybean seed meal (defatted or non-defatted) and other ingredients (e.g., other cereal grains, other seed meal, other protein meal, other oil, other starch, other sugar, a binder, a preservative, a humectant, a vitamin, and/or mineral)); (e) a pharmaceutical; (f) raw or processed biomass (e.g., cellulosic and/or lignocellulosic material); and (g) various industrial products.

EMBODIMENTS

Various embodiments of the plants, genomes, methods, biological samples, and other compositions described herein are set forth in the following sets of numbered embodiments.

1a. A transgenic soybean plant cell comprising an INHT26 transgenic locus comprising an originator guide RNA recognition site (OgRRS) in a first DNA junction polynucleotide of a DAS44406-6 transgenic locus and a cognate guide RNA recognition site (CgRRS) in a second DNA junction polynucleotide of the DAS44406-6 transgenic locus.

1b. A transgenic soybean plant cell comprising an INHT26 transgenic locus comprising an insertion and/or substitution of DNA in a DNA junction polynucleotide of a DAS44406-6 transgenic locus with DNA comprising a cognate guide RNA recognition site (CgRRS).

2. The transgenic soybean plant cell of embodiment 1a or 1b, wherein said CgRRS comprises the DNA molecule set forth in SEQ ID NO: 16 or 17; and/or wherein said DAS44406-6 transgenic locus is set forth in SEQ ID NO: 1, is present in seed deposited at the ATCC under accession No. PTA-11336, is present in progeny thereof, is present in allelic variants thereof, or is present in other variants thereof.

3. The transgenic soybean plant cell of embodiments 1a, 1b, or 2, wherein said INHT26 transgenic locus comprises the DNA molecule set forth in SEQ ID NO: 14 or an allelic variant thereof.

4. A transgenic soybean plant part comprising the soybean plant cell of any one of embodiments 1a, 1b, 2, or 3, wherein said soybean plant part is optionally a seed.

5. A transgenic soybean plant comprising the soybean plant cell of any one of embodiments 1a, 1b, 2, or 3.

6. A method for obtaining a bulked population of inbred seed comprising selfing the transgenic soybean plant of embodiment 5 and harvesting seed comprising the INHT26 transgenic locus from the selfed soybean plant.

7. A method of obtaining hybrid soybean seed comprising crossing the transgenic soybean plant of embodiment 5 to a second soybean plant which is genetically distinct from the first soybean plant and harvesting seed comprising the INHT26 transgenic locus from the cross.

8. A DNA molecule comprising SEQ ID NO: 14, 16, or 17, or an allelic variant thereof.

9. A processed transgenic soybean plant product comprising the DNA molecule of embodiment 8.

10. A biological sample containing the DNA molecule of embodiment 8.

11. A nucleic acid molecule adapted for detection of genomic DNA comprising the DNA molecule of embodiment 8, wherein said nucleic acid molecule optionally comprises a detectable label.

12. A method of detecting a soybean plant cell comprising the INHT26 transgenic locus of any one of embodiments 1a, 1b, 2, or 3, comprising the step of detecting a DNA molecule comprising SEQ ID NO: 14, 16, or 17, or an allelic variant thereof.

13. A method of excising the INHT26 transgenic locus from the genome of the soybean plant cell of any one of embodiments 1a, 1b, 2, or 3, comprising the steps of:

-   (a) contacting the plant cell or genome thereof with: (i) an RNA dependent DNA endonuclease (RdDe); and (ii) a guide RNA (gRNA) capable of hybridizing to the guide RNA hybridization site of the OgRRS and the CgRRS; wherein the RdDe recognizes an OgRRS/gRNA and a CgRRS/gRNA hybridization complex; and,
-   (b) selecting a transgenic plant cell, transgenic plant part, or transgenic plant wherein the INHT26 transgenic locus flanked by the OgRRS and the CgRRS has been excised.

EXAMPLES

Example 1. Application of a Cas12a RNA Guided Endonuclease and Guide RNAs to Change or Excise the 3′-T-DNA Junction Sequence in the DAS44406 Event

The DAS44406 3′ junction polynucleotide sequence set forth in SEQ ID NO: 3 is flanked by five Cas12a recognition sequences. The Guide-1 and Guide-2 sequences are located at the 5′-end of SEQ ID NO: 3 and Guide-3 through Guide-5 lie within the 3′ junction polynucleotide sequence of SEQ ID NO: 3. These can be used to modify some of the 3′ junction polynucleotide sequence or eliminate most of it. There are several iterations of this approach. In one embodiment, Guide-3, Guide-4, or Guide-5 is used alone to disrupt the DAS44406 3′-junction sequence (e.g., by using a Cas12a endonuclease and one of Guide-3, Guide-4, or Guide-5 to cleave the 3′ junction polynucleotide sequence and recovering genomic edits where the 3′ DNA junction polynucleotide sequence of DAS44406 is disrupted). In another embodiment, Guide-1 or Guide-2 is used with either Guide-3, Guide-4 or Guide-5 to eliminate most of the DAS44406 3′ junction polynucleotide sequence.

TABLE ONE. Description of Guide RNAs and SEQ ID NO of DNA encoding same

Guide RNA ID    SEQ ID NO    Start-End in SEQ ID NO: 1    Strand of SEQ ID NO: 1    PAM
Guide-1         4            11656-11678                  -1                        TTTA
Guide-2         5            11646-11688                  1                         TTTA
Guide-3         6            11725-11747                  1                         TTTG
Guide-4         7            11730-11752                  1                         TTTA
Guide-5         8            11773-11795                  1                         TTTA
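The guide combinations described in Example 1 can be enumerated from the Table One coordinates. The following Python sketch is illustrative only and is not part of the disclosed method. The function names (single_guide_designs, paired_guide_designs, target_span) are hypothetical, and the reported span is simply the outer guide coordinates in SEQ ID NO: 1 as listed in the table. It is not a prediction of Cas12a cleavage positions or of the repair outcome.

```python
from itertools import product

# Table One values copied from the text:
# guide ID -> (SEQ ID NO, start, end, strand, PAM), coordinates in SEQ ID NO: 1.
GUIDES = {
    "Guide-1": (4, 11656, 11678, -1, "TTTA"),
    "Guide-2": (5, 11646, 11688, 1, "TTTA"),
    "Guide-3": (6, 11725, 11747, 1, "TTTG"),
    "Guide-4": (7, 11730, 11752, 1, "TTTA"),
    "Guide-5": (8, 11773, 11795, 1, "TTTA"),
}

# Grouping taken from Example 1: Guide-1/2 sit at the 5'-end of SEQ ID NO: 3,
# Guide-3/4/5 lie within the 3' junction polynucleotide.
UPSTREAM = ["Guide-1", "Guide-2"]
WITHIN = ["Guide-3", "Guide-4", "Guide-5"]


def single_guide_designs():
    """Designs that use one guide alone to disrupt the 3' junction."""
    return [(guide,) for guide in WITHIN]


def paired_guide_designs():
    """Designs that pair an upstream guide with a guide inside the junction."""
    return list(product(UPSTREAM, WITHIN))


def target_span(design):
    """Outer (start, end) coordinates in SEQ ID NO: 1 covered by a design."""
    starts = [GUIDES[guide][1] for guide in design]
    ends = [GUIDES[guide][2] for guide in design]
    return min(starts), max(ends)


if __name__ == "__main__":
    for design in single_guide_designs() + paired_guide_designs():
        start, end = target_span(design)
        print(design, start, end, end - start + 1)
```

Listing the designs this way makes it easy to see which pairs bracket the largest portion of the junction before any constructs are built.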

The Cas12a nuclease and the single or combined guide RNAs are introduced into soybean plant cells containing the DAS44406-6 event. In certain embodiments, the Cas12a nuclease and gRNA(s) are encoded and expressed from a T-DNA transformed into the DAS44406-6 event via Agrobacterium-mediated transformation. Alternatively, the T-DNA can be transformed into any convenient soy line, and then crossed with the DAS44406-6 event to combine the Cas12a ribonucleoprotein expressing T-DNA with the DAS44406-6 event. The Cas12a nuclease and gRNAs can also be assembled in vitro then delivered to DAS44406-6 explants as ribonucleoprotein complexes using a biolistic approach (Svitashev et al., Nat Commun. 2016; 7:13274; Zhang et al., 2021, Plant Commun. 2(2):100168). Also, a plasmid encoding a Cas12a nuclease and the gRNA(s) can be delivered to DAS44406-6 explants using a biolistic approach. This will produce plant cells that have a high likelihood of incurring mutations that disrupt the DAS44406-6 3′ junction polynucleotide sequence.

In the Agrobacterium approach, a binary vector that contains a strong constitutive expression cassette like the AtUbi10 promoter::AtUbi10 terminator driving Cas12a, a PolII or PolIII gene cassette driving the Cas12a gRNA(s) and a CaMV 35S:NPTII:NOS or other suitable plant selectable marker (e.g., a phosphomannose isomerase (Reed et al. 2001, In Vitro Cellular & Developmental Biology - Plant 37: 127-132) or hygromycin phosphotransferase (Itaya et al. 2018, In Vitro Cellular & Developmental Biology - Plant 54: 184-194)) is constructed and cells comprising the integrated T-DNA(s) are selected using an appropriate selection agent. An expression cassette driving a fluorescent protein like mScarlet may also be useful to monitor the plant transformation process.

The T-DNA-based expression cassettes are delivered from superbinary vectors in Agrobacterium strain LBA4404. Soy transformations are performed based on published methods (Zhang et al., 1999, Plant Cell, Tissue and Organ Culture 56(1), 37-46). Briefly, cotyledonary explants are prepared from the 5-day-old soybean seedlings by making a horizontal slice through the hypocotyl region, approximately 3-5 mm below the cotyledon. A subsequent vertical slice is made between the cotyledons, and the embryonic axis is removed. This generates 2 cotyledonary node explants. Approximately 7-12 vertical slices are made on the adaxial surface of the explant about the area encompassing 3 mm above the cotyledon/hypocotyl junction and 1 mm below the cotyledon/hypocotyl junction. Explant manipulations are done with a No. 15 scalpel blade.

Explants are immersed in the Agrobacterium inoculum for 30 min and then co-cultured on 100 × 15 mm Petri plates containing the Agrobacterium resuspension medium solidified with 0.5% purified agar (BBL Cat # 11853). The co-cultivation plates are overlaid with a piece of Whatman #1 filter paper (Mullins et al., 1990; Janssen and Gardner, 1993; Zhang et al., 1997). The explants (5 per plate) are cultured adaxial side down on the co-cultivation plates, which are overlaid with filter paper, for 3 days at 24° C., under an 18/6 hour light regime with an approximate light intensity of 80 µmol s⁻¹ m⁻² (F17T8/750 cool white bulbs, Litetronics®). The co-cultivation plates are wrapped with Parafilm®. Following the co-cultivation period, explants are briefly washed in B5 medium supplemented with 1.67 mg L⁻¹ BAP, 3% sucrose, 500 mg L⁻¹ ticarcillin and 100 mg L⁻¹ cefotaxime. The medium is buffered with 3 mM MES, pH 5.6. Growth regulator, vitamins and antibiotics are filter sterilized post autoclaving. Following the washing step, explants are cultured (5 per plate) in 100 × 20 mm Petri plates, adaxial side up with the hypocotyl embedded in the medium, containing the washing medium solidified with 0.8% purified agar (BBL Cat # 11853) amended with either G418, neomycin, or kanamycin at concentrations permitting selection of transformants. This medium is referred to as shoot initiation medium (SI). Plates are wrapped with 3M pressure sensitive tape (Scotch™, 3M, USA) and cultured under the environmental conditions used during the seed germination step (at 24° C., 18/6 hour light regime, under a light intensity of approximately 150 µmol s⁻¹ m⁻²).

After 2 weeks of culture, the hypocotyl region is excised from each of the explants, and the remaining explant, cotyledon with differentiating node, is subsequently subcultured onto fresh SI medium. Following an additional 2 weeks of culture on SI medium, the cotyledons are removed from the differentiating node. The differentiating node is subcultured to shoot elongation medium (SE) composed of Murashige and Skoog (MS) (1962) basal salts, B5 vitamins, 1 mg L⁻¹ zeatin-riboside, 0.5 mg L⁻¹ GA3 and 0.1 mg L⁻¹ IAA, 50 mg L⁻¹ glutamine, 50 mg L⁻¹ asparagine, 3% sucrose and 3 mM MES, pH 5.6. The SE medium is amended with G418, neomycin, or kanamycin at concentrations permitting selection of transformants. The explants are subcultured biweekly to fresh SI medium until shoots reach a length greater than 3 cm. The elongated shoots are rooted on Murashige and Skoog salts with B5 vitamins, 1% sucrose, 0.5 mg L⁻¹ NAA without further selection in Magenta® boxes.

When a sufficient amount of viable tissue is obtained, it can be screened for mutations at the DAS44406-6 junction sequence, using a PCR-based approach. One way to screen is to design DNA oligonucleotide primers that flank and amplify the DAS44406-6 junction plus surrounding sequence. For example, the primers (5′-AGCGGCCGGGTTTCTAGTCACCGGT-3′; SEQ ID NO: 9) and (5′-TCTCATTTTCACACATATACATGCA-3′; SEQ ID NO: 10) will produce a ~440 bp product in a PCR reaction that can be analyzed for edits at the target site. The size of this product will vary based on the nature of the edit. Amplicons can be sequenced directly using an amplicon sequencing approach or ligated to a convenient plasmid vector for Sanger sequencing. Those plants in which the DAS44406-6 3′-junction sequence is disrupted are selected and grown to maturity. The DNA encoding the Cas12a reagents can be segregated away from the modified junction sequence in a subsequent generation.
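The size-based screen described above can be sketched in Python. This is an illustrative sketch under stated assumptions, not a protocol from this disclosure. The primer sequences are those of SEQ ID NO: 9 and 10. The helper names and the 5 bp size tolerance are hypothetical, and an amplicon that shows no size change would still need sequencing, since small edits may not be resolved by size.

```python
# Primers quoted in the text (SEQ ID NO: 9 and SEQ ID NO: 10).
FORWARD_PRIMER = "AGCGGCCGGGTTTCTAGTCACCGGT"
REVERSE_PRIMER = "TCTCATTTTCACACATATACATGCA"

# Approximate unedited product size stated in the text (~440 bp).
EXPECTED_AMPLICON_BP = 440

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def classify_amplicon(observed_bp, expected_bp=EXPECTED_AMPLICON_BP, tolerance_bp=5):
    """Coarse size call; tolerance_bp is an assumed gel/fragment resolution."""
    delta = observed_bp - expected_bp
    if abs(delta) <= tolerance_bp:
        return "no size change detected; sequence to confirm"
    return "candidate deletion" if delta < 0 else "candidate insertion"


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD_PRIMER), ("reverse", REVERSE_PRIMER)):
        print(name, len(primer), round(gc_fraction(primer), 2))
    print(classify_amplicon(360))
```

A call such as classify_amplicon(360) flags a product that is markedly shorter than the expected size, which is the outcome expected when two guides remove most of the junction sequence.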

Example 2. Insertion of a CgRRS Element in the 3′-Junction of the DAS44406-6 Event

Two plant gene expression vectors are prepared. Plant expression cassettes for expressing a bacteriophage lambda exonuclease, a bacteriophage lambda beta SSAP protein, and an E. coli SSB are constructed essentially as set forth in U.S. Pat. Application Publication 20200407754, which is incorporated herein by reference in its entirety. A DNA sequence encoding a tobacco c2 nuclear localization signal (NLS) is fused in-frame to the DNA sequences encoding the exonuclease, the bacteriophage lambda beta SSAP protein, and the E. coli SSB to provide a DNA sequence encoding the c2 NLS-Exo, c2 NLS lambda beta SSAP, and c2 NLS-SSB fusion proteins that are set forth in SEQ ID NO: 135, SEQ ID NO: 134, and SEQ ID NO: 133 of U.S. Pat. Application Publication 20200407754, respectively, and incorporated herein by reference in their entireties. DNA sequences encoding the c2 NLS-Exo, c2 NLS lambda beta SSAP, and c2 NLS-SSB fusion proteins are operably linked to suitable promoter(s) (e.g., AtUbi10, CaMV35S, and/or SlUbi10 promoter) and suitable polyadenylation site(s) (e.g., nos 3′, PeaE9 3′, tmr 3′, tms 3′, AtUbi10 3′, and tr7 3′ elements), to provide the exonuclease, SSAP, and SSB plant expression cassettes.

A DNA donor template sequence (SEQ ID NO: 11) that targets the 3′-T-DNA junction polynucleotide of the DAS44406-6 event (SEQ ID NO: 1; FIG. 1) for HDR-mediated insertion of a 27 base pair OgRRS sequence (SEQ ID NO: 18) that is identical to a Cas12a recognition site at the 5′-junction polynucleotide of the DAS44406-6 T-DNA insert is constructed. The DNA donor sequence includes a replacement template with the desired insertion region (27 base pairs long) flanked on both sides by homology arms about 500-635 bp in length. The homology arms match (i.e., are homologous to) gDNA (genomic DNA) regions flanking the target genomic DNA insertion site (SEQ ID NO: 3) in the DAS44406-6 transgenic locus (SEQ ID NO: 1). The replacement template region comprising the donor DNA is flanked at each end by DNA sequences identical to the DAS44406-6 3′ junction polynucleotide sequence and contains a CgRRS element recognized by the same Cas12a RNA-guided nuclease and a gRNA (e.g., comprising an RNA encoded by SEQ ID NO: 19) that recognize the OgRRS located in the 5′ junction polynucleotide.

A plant expression cassette that provides for expression of the RNA-guided sequence-specific Cas12a endonuclease is constructed. A plant expression cassette that provides for expression of a guide RNA (e.g., encoded by SEQ ID NO: 8) complementary to sequences adjacent to the insertion site is constructed. An Agrobacterium superbinary plasmid transformation vector containing a cassette that provides for the expression of a suitable plant selectable marker (e.g., a neomycin phosphotransferase (nptII) or hygromycin phosphotransferase (hpt)) is constructed. Once the cassettes, donor sequence and Agrobacterium superbinary plasmid transformation vector are constructed, they are combined to generate two soybean transformation plasmids. In other embodiments, other gRNAs (Guide-3 or Guide-4 alone or Guide-1 or Guide-2 with either Guide-3, Guide-4 or Guide-5) can be used to introduce double stranded breaks in the DAS44406-6 3′ junction polynucleotide for insertion of a CgRRS using similar donor DNA templates and the aforementioned Cas12a, SSAP, SSB, and EXO reagents.

A soybean transformation plasmid is constructed by inserting a neomycin phosphotransferase (nptII) or hygromycin phosphotransferase (hpt) cassette, the RNA-guided sequence-specific endonuclease cassette, the guide RNA cassette, and the DAS44406-6 3′-T-DNA junction sequence DNA donor sequence into the Agrobacterium superbinary plasmid transformation vector (the control vector).

A soybean transformation plasmid is constructed by inserting a neomycin phosphotransferase (nptII) or hygromycin phosphotransferase (hpt) cassette, the RNA-guided sequence-specific endonuclease cassette, the guide RNA cassette, the SSB cassette, the lambda beta SSAP cassette, the Exo cassette, and the DAS44406-6 3′-T-DNA junction sequence donor DNA template sequence (SEQ ID NO: 11) into the Agrobacterium superbinary plasmid transformation vector (the lambda red vector).

All constructs are transformed into Agrobacterium strain LBA4404.

Soybean transformations are performed based on published methods (Ishida et al., Nature Protocols 2007; 2, 1614-1621). Briefly, immature embryos from inbred line GIBE0104, approximately 1.8-2.2 mm in size, are isolated from surface sterilized ears 10-14 days after pollination. Embryos are placed in an Agrobacterium suspension made with infection medium at a concentration of OD600 = 1.0. Acetosyringone (200 µM) is added to the infection medium at the time of use. Embryos and Agrobacterium are placed on a rocker shaker at slow speed for 15 minutes. Embryos are then poured onto the surface of a plate of co-culture medium. Excess liquid medium is removed by tilting the plate and drawing off all liquid with a pipette. Embryos are flipped as necessary to maintain a scutellum up orientation. Co-culture plates are placed in a box with a lid and cultured in the dark at 22° C. for 3 days. Embryos are then transferred to resting medium, maintaining the scutellum up orientation. Embryos remain on resting medium for 7 days at 27-28° C. Embryos that produce callus are transferred to Selection 1 medium with G418 or hygromycin at concentrations permitting selection of transformants when a nptII or hpt selectable marker, respectively, is used and cultured for an additional 7 days. Callused embryos are placed on Selection 2 medium with 10 mg/L PPT and cultured for 14 days at 27-28° C. Growing calli resistant to the selection agent are transferred to Pre-Regeneration medium with 10 mg/L PPT to initiate shoot development. Calli remain on Pre-Regeneration medium for 7 days. Calli beginning to initiate shoots are transferred to Regeneration medium with G418 or hygromycin at concentrations permitting selection of transformants when a nptII or hpt selectable marker is used in Phytatrays and cultured in light at 27-28° C. Shoots that reach the top of the Phytatray with intact roots are isolated into Shoot Elongation medium prior to transplant into soil and gradual acclimatization to greenhouse conditions.

When a sufficient amount of viable tissue is obtained, it can be screened for insertion at the DAS44406-6 junction sequence, using a PCR-based approach. The PCR primer on the 5′-end is 5′-TATTGTCGCCGTATGTAATCGGCGT-3′ (SEQ ID NO: 12). The PCR primer on the 3′-end is 5′-TTTTAGTTCAAGTCAACTTGTCAGT-3′ (SEQ ID NO: 13). The above primers that flank donor DNA homology arms are used to amplify the DAS44406-6 3′-junction polynucleotide sequence. The correct donor sequence insertion will produce a 1366 bp product. A unique DNA fragment comprising the CgRRS in the DAS44406-6 3′ junction polynucleotide is set forth in SEQ ID NO: 17. Amplicons can be sequenced directly using an amplicon sequencing approach or ligated to a convenient plasmid vector for Sanger sequencing. Those plants in which the DAS44406-6 junction sequence now contains the intended Cas12a recognition sequence are selected and grown to maturity. The T-DNA encoding the Cas12a reagents can be segregated away from the modified junction sequence in a subsequent generation. The resultant INHT26 transgenic locus (SEQ ID NO: 14) comprising the CgRRS and OgRRS (e.g., which each comprise SEQ ID NO: 18) can be excised using Cas12a and a suitable gRNA which hybridizes to DNA comprising SEQ ID NO: 19 at both the OgRRS and the CgRRS.
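The logic of confirming that a sequenced locus carries a recognition site at both junctions, and of modeling excision of the intervening DNA, can be sketched in Python. This is a toy model under stated assumptions. PLACEHOLDER_SITE is an invented 27-nucleotide string and is not SEQ ID NO: 18; the function names are hypothetical; and the model simply removes everything from the first site through the last site, whereas actual Cas12a cleavage positions and repair junctions will differ.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Invented 27-nt stand-in for a recognition site; NOT a sequence from this disclosure.
PLACEHOLDER_SITE = "TTTAGACCTGAGGCTTACGATCCAGTA"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(locus, site):
    """Sorted half-open (start, end) intervals where site occurs on either strand."""
    locus = locus.upper()
    hits = set()
    for pattern in {site.upper(), reverse_complement(site)}:
        start = locus.find(pattern)
        while start != -1:
            hits.add((start, start + len(pattern)))
            start = locus.find(pattern, start + 1)
    return sorted(hits)


def excise_between_sites(locus, site):
    """Toy excision: drop everything from the first site through the last site."""
    hits = find_sites(locus, site)
    if len(hits) < 2:
        raise ValueError("recognition sites are needed at both junctions")
    first_start = hits[0][0]
    last_end = hits[-1][1]
    return locus[:first_start] + locus[last_end:]


if __name__ == "__main__":
    # Toy locus: flank + site + "insert" + site + flank.
    toy_locus = "GGGG" + PLACEHOLDER_SITE + "ACGTACGTAC" + PLACEHOLDER_SITE + "CCCC"
    print(find_sites(toy_locus, PLACEHOLDER_SITE))
    print(excise_between_sites(toy_locus, PLACEHOLDER_SITE))
```

Searching both strands matters because a recognition site inserted at one junction may be in the opposite orientation relative to the site at the other junction.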

The breadth and scope of the present disclosure should not be limited by any of the above-described embodiments.

What is claimed is:
 1. A DNA molecule comprising SEQ ID NO: 16 or 17.
 2. A processed transgenic soybean plant product comprising the DNA molecule of claim 1.
 3. The processed transgenic soybean plant product of claim 2, wherein said product comprises defatted or non-defatted soybean seed meal.
 4. A biological sample containing the DNA molecule of claim 1.
 5. A nucleic acid molecule adapted for detection of genomic DNA comprising the DNA molecule of claim 1.
 6. The nucleic acid molecule of claim 5, wherein said nucleic acid molecule comprises a detectable label.
 7. A method of detecting a soybean plant cell comprising the INHT26 transgenic locus comprising the DNA molecule of SEQ ID NO: 14, comprising the step of detecting the DNA molecule comprising SEQ ID NO: 16 or 17.